PREFACE


This  book  is  concerned  with  a  systematic  investigation  of  the  concept  of 
"measure-free"  conditioning  and  its  associated  logic  for  intelligent  systems.  Its  purpose  is 
to  provide  a  foundation  for  inference  in  such  systems.  The  basic  problem  is  the 
representation  and  evaluation  of  implicative  statements  in  natural  language,  in  a  way 
compatible  with  conditional  probability.  This  longstanding  problem  involves  three  distinct 
disciplines:  natural  language,  logic,  and  probability.  The  results  are  organized  in  book 
form  here  for  the  first  time. 

Two audiences are in mind: Artificial Intelligence researchers who are primarily
interested in reasoning under uncertainty in intelligent systems, and mathematicians in the
fields of probabilistic modeling and logic. This diversity of audience requires that some
sections be tutorial and elementary in nature.

Specifically,  this  work  bridges  the  gap  between  numerically  based  probabilistic 
conditioning  and  the  logic  underlying  implicative  statements  in  natural  language.  This 
problem  has  been  addressed  in  the  past,  for  example,  by  Boole,  DeFinetti,  Koopman, 
Copeland,  Schay,  and  Adams.  Those  efforts  are  incomplete,  perhaps  because  of  lack  of 
motivation  by  real  world  problems.  In  any  case,  work  in  this  field  has  gone  unrecognized 
by  the  mainstream  of  researchers,  particularly  the  work  of  Schay  in  1968  on  the  algebra  of 
conditional  events,  which  remains  almost  totally  uncited  in  the  literature. 

The situation is different today. The problem is before us because of the need to
provide a firm foundation for probabilistic reasoning in intelligent systems; in particular,
how to combine conditional information arising from disparate sources in expert systems
and  how  to  compute  it  probabilistically.  This  is  in  line  with  the  Bayesian  approach  to 
probabilistic  reasoning  in  intelligent  systems  (Pearl,  1988).  Probability  not  only  has  a  firm 
mathematical  foundation,  but  also  the  conditional  probability  operator  captures  a  form  of 
non-monotonicity  of  common  sense  reasoning. 

Our  goal  is  a  more  complete  and  satisfactory  theory  of  "measure-free"  conditioning. 
If  the  concept  of  "conditional  event"  can  be  formalized  and  a  suitable  algebra  of  operators 
between  such  events  be  developed,  then  the  resulting  structure  will  have  use  in  designing 
inference rules in expert systems. With probability being the method of choice for
handling uncertainty despite the plethora of non-probabilistic procedures such as
Dempster-Shafer belief functions and Zadeh's fuzzy sets, it is natural to develop a logic of
conditional events compatible with conditional probabilities. However, the basic
work  here  can  be  adapted  and  extended  in  various  directions,  such  as  to  the  fuzzy  set 
setting  (Chapter  7),  as  well  as  to  the  Dempster-Shafer  belief  function  setting  (see,  for 
example,  Dubois  and  Prade  (1988)).  This  development  is  not  to  be  confused  with  other 
"conditional  logics",  such  as  that  of  Nute  (1980)  and  Appiah  (1985),  which  are  not 
compatible  with  conditional  probability,  nor  with  non-commutative  extensions  of  Boolean 
logic  (Guzman  and  Squier,  1990).  Our  approach  differs  also  from  that  of  Adams  (1975), 
who  takes  conditionals  as  primitives  in  natural  language,  while  ours  are  mathematical 
entities. 

This  book  is  primarily  concerned  with  theory.  The  reader  is  expected  to  be  familiar 
with  basic  probability  theory,  elementary  logic,  and  elementary  facts  from  ring  theory. 
However,  the  text  is  largely  self-contained.  The  hope  is  that  this  book  will  trigger  further 
interest  in  both  the  theory  and  applications  of  this  topic. 

In conducting the research leading to this monograph, we have benefited from
discussions with various people. In particular, acknowledgements are expressed to Dr.
Philip Calabrese for his thought-provoking treatise on measure-free conditional events
(Calabrese,  1987),  and  the  lengthy  personal  communications  exchanged  on  the  topic. 
Thanks  are  extended  to  Professors  Geza  Schay  of  the  University  of  Massachusetts  at 
Boston,  Kevin  Hestir  and  Gerald  Rogers  of  New  Mexico  State  University,  and  to  David 
Stein  of  the  Naval  Ocean  Systems  Center  at  San  Diego. 

The  first  named  author  expresses  his  appreciation  for  support  by  Dr.  Ralph  Wachter 
of  the  Office  of  Naval  Research,  Dr.  Alan  Gordon  of  the  Independent  Research  Office, 
Control Department

CHAPTER 0
INTRODUCTION 

In  this  Introduction,  we  outline  the  motivation  and  objectives,  as  well  as  the  main 
contributions  to  the  theory  of  measure-free  conditioning. 

0.1  Motivation  and  objectives 

This  work  addresses  an  anomaly  involving  probability  and  logic  relative  to  the 
interpretation of implicative statements, and the evaluation of those statements compatibly
with conditional probability. One of our chief motivations is the need to formalize
rigorously  the  connections  between  conditional  probability  and  the  "hidden"  logic  of 
implicative  statements,  such  as  production  rules  in  expert  systems  and  defaults  in 
common-sense  reasoning.  The  purpose  is  to  provide  theoretical  results  for  probabilistic 
reasoning  that  will  be  useful  in  the  design  and  evaluation  of  inference  rules  of  such 
systems. 

We  now  describe  the  basic  problem  in  some  detail.  Within  the  context  of 
logic-based  formal  methods  in  artificial  intelligence,  the  space  of  propositions  (facts, 
evidence,  information,  and  so  on)  is  represented  by  an  algebraic  structure  R  known  as  a 
Boolean algebra. The basic connectives among propositions, namely negation,
conjunction, and disjunction, correspond to operators on R, denoted $'$, $\wedge$, and $\vee$,
respectively. When elements of R are uncertain, as is often the case in expert systems,
classical two-valued logic has to be replaced by probability logic, in which probabilities
play  the  role  of  truth  values.  However,  our  knowledge  often  contains  uncertain 
conditional  information  of  the  form  "if  b  then  a",  where  a  and  b  are  elements  of  R.  These 
conditional  propositions  are  referred  to  as  implicative  statements,  or  conditionals.  In 
expert  systems,  these  are  "production  rules".  In  order  to  make  inferences  from  this  type  of 
knowledge,  it  is  necessary  to  develop  an  appropriate  logic  in  which  these  conditionals  can 
be  represented  and  manipulated  in  order  to  combine  evidence,  and  in  which  an  entailment 
relation  can  be  formulated.  A  quantitative  approach  to  this  starts  with  the  quantification 
of the strength or the "truth" of conditionals. For example, if the conditional "if b then a"
is written in the language of Boolean logic, then one can model it by material implication,
that is

$$b \to a = b' \vee a.$$

If P is a probability measure on R, then $P(b \to a) = P(b' \vee a)$ can be used as such a
quantification. However, it is more reasonable to quantify the conditional "if b then a" by
the conditional probability $P(a|b)$, which is clearly different from $P(b' \vee a)$. Indeed

$$P(b' \vee a) = P(b' \vee ab) = 1 - P(b) + P(ab) \neq P(ab)/P(b).$$

While  this  is  consistent  with  probability  logic  for  unconditional  propositions,  that  is,  for 
elements  of  R,  one  cannot  represent  the  conditional  "if  b  then  a"  mathematically. 
Indeed, there is no counterpart of $P(a|b)$ in logic. Logic lacks a conditioning operator
corresponding to conditional probability. Since material implication $b \to a$ is not
compatible with probability in the sense that

$$P(b \to a) \neq P(a|b),$$

one might attempt to look for other operations f on R, Boolean or not, such that
$P(f(a,b)) = P(a|b)$ for all probabilities P on R and all $a, b \in R$ with $b \neq 0$. Such
attempts have been laid to rest by Lewis' Triviality Result. (See Chapter 1.) To model
"measure-free" conditional events $(a|b)$ compatible with conditional probability, one has
to go outside of R. Thus $(a|b)$ cannot be modeled as an ordinary proposition.
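
The gap between material implication and conditional probability can be checked numerically. The following is a minimal sketch, assuming a fair six-sided die as the sample space; the events a and b and the helper names (prob, material_implication, conditional) are illustrative choices and do not come from the text.

```python
from fractions import Fraction

# Assumed illustrative sample space: one roll of a fair die.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Uniform probability of a subset of OMEGA."""
    return Fraction(len(event), len(OMEGA))

def material_implication(b, a):
    """The event b' v a, that is, (not b) or a."""
    return (OMEGA - b) | a

def conditional(a, b):
    """P(a | b) = P(ab) / P(b); requires P(b) > 0."""
    return prob(a & b) / prob(b)

a = frozenset({2})        # hypothetical event: "the roll is 2"
b = frozenset({2, 4, 6})  # hypothetical event: "the roll is even"

p_implication = prob(material_implication(b, a))  # equals 1 - P(b) + P(ab)
p_conditional = conditional(a, b)

print(p_implication)  # 2/3
print(p_conditional)  # 1/3
```

With these particular events the two quantifications of "if b then a" differ by a factor of two, which is the discrepancy the displayed inequality expresses.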

The first question then is to determine a suitable mathematical entity $R|R$ for
conditional events $(a|b)$. Once such a model $R|R$ is determined, for each probability P
on R, one has P extended to a "semantic evaluation" on $R|R$.

With the space $R|R$ as the counterpart of R in the unconditional case, one then
proceeds to define connectives among conditionals, for example conjunctions

(if b then a) $\wedge$ (if d then c)

whose result is another conditional in $R|R$. Such operations yield an algebra of
conditionals, extending the algebra of unconditional events of R. Choosing a correct
model $R|R$ for these conditional events, and then choosing suitable logical operations on
those conditional events which extend those of R, is the backbone of the problem. Once
such operations have been found, it is then possible to assign probabilities to compounds
of conditionals, since, for example, to evaluate

P[(if b then a) $\wedge$ (if d then c)],

one merely has to carry out the operation $\wedge$ between the two conditionals, yielding
another conditional, and then evaluate P at that conditional. The algebraic structure of
$R|R$ together with a probability on R extended to $R|R$ forms the core for the
development of conditional probability logic extending that of probability logic.

In summary, the problem we are facing is this. For a Boolean algebra $(R, ', \wedge, \vee)$,

(1) find a "measure-free" conditioning map f from $R \times R$ to some space $R|R$ so
that $P[f(a,b)] = P(a|b)$ defines a function on $R|R$ extending P on R;

(2) define logical operations $'$, $\wedge$, $\vee$ on $R|R$ extending the corresponding ones on
R, and

(3) with conditional probabilities as semantic evaluations, develop a conditional
probability logic with syntax $(R|R, ', \wedge, \vee)$.

No  satisfactory  solution  to  the  problem  seems  to  exist,  even  in  the  vast  numerically 
oriented  literature  treating  conditioning  in  probability  and  logic.  A  solution  entails  the 
development  of  "conditional  event  algebras",  and  lies  outside  the  scope  of  conditional 
probability  literature.  This  aspect  has  been  considered  by  only  a  handful  of  researchers, 
with  no  concerted  effort  being  made  in  that  direction.  In  this  monograph,  we  present  a 
solution  to  the  problem  in  the  form  of  a  conditional  events  algebra  that  is  new,  rigorous, 
comprehensive,  and  computationally  tractable.  The  theory  of  measure-free  conditioning 
presented  here  can  be  used  both  as  a  basis  for  treating  the  problem  of  combining  evidence 
and  as  groundwork  for  further  investigations  into  the  connection  between  probability  and 
logic. 

We  now  return  to  the  topic  of  inference  rules  in  expert  and  intelligent  systems  as 
one  of  the  main  motivating  sources  for  posing  the  basic  problem  mentioned  above. 
Automated  reasoning  in  intelligent  systems  is  based  on  logical  entailment  (or  logical 
consequences  or  implication)  in  some  logic.  For  example,  in  mechanical  theorem  proving 
where  first  order  logic  is  used,  one  of  the  usual  ways  to  draw  conclusions  is  through  the 
use  of  modus  ponens,  which  simply  says  that  if  b  implies  a  and  b  is  true,  then  a  is  true. 
This means that a follows logically from $\{b \to a, b\}$, and this translates into the syntax of
first order logic as $(b \to a) \wedge b \leq a$. Note that here $\leq$ is precisely the logical entailment
relation of first order logic, and the modeling of conditional information of the form
"if b then a" is via material implication mentioned above.
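
The order-theoretic form of modus ponens can be verified exhaustively on a small Boolean algebra of sets. This is only a sketch under stated assumptions: the three-element universe and the function names (subsets, implies, modus_ponens_holds) are hypothetical, and set inclusion plays the role of the order relation.

```python
from itertools import combinations

# Assumed small universe, chosen only so that every pair (a, b) can be checked.
OMEGA = frozenset({0, 1, 2})

def subsets(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def implies(b, a):
    """Material implication b -> a, modeled as b' v a."""
    return (OMEGA - b) | a

def modus_ponens_holds():
    """Check (b -> a) ^ b <= a, with <= read as set inclusion."""
    return all((implies(b, a) & b) <= a
               for a in subsets(OMEGA)
               for b in subsets(OMEGA))

print(modus_ponens_holds())  # True
```

The check succeeds because (b' v a) ^ b reduces to ab, which is always contained in a.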

The  situation  in  reasoning  under  uncertainty  is  more  complicated.  First,  the 
knowledge  base  consists  of  conditional  information  which  is  not  known  with  complete 
certainty. Second, human common sense reasoning is basically "non-monotonic" in
nature, whereas first order logic is monotone. Non-monotonicity means that one can retract prior
conclusions in light of new evidence. From a logical, non-numerical approach, the
modeling of "if b then a" should be investigated, and a non-monotonic logic for
"conditionals" should be found. A well known example is Reiter's (1980) logic of
defaults. If we want to treat uncertainty in conditional information in a more quantitative
way, various uncertainty measures could be used. The most popular numerical approach
is a Bayesian one, in which probabilities assigned to conditionals are conditional
probabilities. Suppose we symbolize conditional statements of the form "if b then a" or
"most b's are a's", or "usually birds fly" by $(a|b)$. Then the knowledge K is of the
form $\{(a_i|b_i) : i = 1, 2, \ldots, n\}$, and the evidence is of the form $E = \{e_j, \ldots\}$.
Non-monotonic reasoning is a logical entailment in a non-monotonic logic whose basic
objects are of the form $(a|b)$. Note that the elements of E can be identified as $(e_j|\Omega)$.
Instead of trying to model $(a|b)$ as a mathematical entity compatible with conditional
probability (as a counterpart of non-conditional propositions with respect to unconditional
probability), a well known approach (for example, Pearl, 1988) is to rely upon the
so-called Adams' logic of conditionals (Adams, 1975), in which conditionals are not
modeled mathematically, but are taken as primitives in our natural language, and the
probability entailment relation is defined semantically. The lack of a conditioning
operator in logic is mentioned in many places in Pearl's book. Moreover, if a
mathematical object $(a|b)$ could be defined, many problems in Adams' book could be
clarified. It is interesting to note that in 1968 Schay published a paper providing a proposal
for such an object $(a|b)$ and its algebra. Definitely, if objects like $(a|b)$ can be
defined, then we can bridge the gap between probability and logic, and reasoning can be
carried out at the syntax level provided that conditional information can be combined.

Thus  the  goal  is  to  develop  a  theory  of  "conditional  events"  compatible  with 
conditional probability, analogous to the role played by Boolean algebra in the theory of
unconditional events and unconditional probability. Perhaps by the very nature of
physical systems and statistical problems, the new concept of conditional events might not
contribute anything new to them. This might explain why the papers by Copeland
published in the Proceedings of the Berkeley Symposium on Mathematical Statistics and
Probability (1945, 1954), or by Schay (1968), have been largely ignored. This is similar to
the case of quantum probability, which is needed for quantum mechanics but not for ordinary probability
models (Gudder, 1988). The need for defining mathematically (measure-free) conditional
events appears also in the Theory of Measurement (Pfanzagl, 1971, Chapter 12). But
unlike Copeland, Pfanzagl proposed to use cosets of Boolean rings to represent
conditional events. However, his analysis was restricted only to each fixed (Boolean)
quotient ring, so that the algebraic structure of the space of all possible cosets was not
investigated. In particular, inference from a collection of conditional events with different
antecedents was not formulated. But, as we will see in Chapter 2, the coset form for
conditional events is a correct one, and this will be derived axiomatically.

0.2 State-of-the-art

The  mathematical  problem  that  we  try  to  analyze  in  this  book  has  been  examined 
over  several  decades,  but  is  apparently  foreign  to  probabilists  as  well  as  to  engineers. 
Most  of  the  results  were  published  in  a  scattered,  unorganized  fashion.  However,  there  are 
two  books  on  the  subject:  those  of  Adams  (1975)  and  Hailperin  (1976),  which  are  in  logic. 
See  also  the  book  of  Pfanzagl  (1971,  Chapter  12). 

Prior to the era of AI, the problem came independently to the attention of the
logicians Stalnaker (1968) and Lewis (1976), as well as to Van Fraassen (1976),
Copeland (1941, 1945, 1950, 1954), Koopman (1940, 1941), and DeFinetti (1974). While
the discussion of the subject within the logic community remains somewhat active,
perhaps because of its philosophical nature, there was no reaction at all in the probability
and statistics community. This is exemplified by Copeland's largely forgotten papers,
which aimed at providing more basic structures for probability theory and statistics,
complementing  Kolmogorov's  model.  The  framework  that  he  proposed,  that  of 
implicative  Boolean  algebras,  was  unsatisfactory,  being  far  too  restrictive,  and  examples 
and  applications  were  not  readily  at  hand. 

At the folklore or unpublished level, all of the attempts to deal with this problem
have been shown to be either patently wrong (such as identifying the probability of
material implication with conditional probability, or combining conditionals with only
the union or intersection of antecedents being taken) or too restrictive or
computationally infeasible (see Chapter 1).

The  conditional  event  "literature"  consists  of  only  a  couple  of  dozen  papers  as 
opposed  to  the  vast  conditional  probability  literature.  Within  this  meager  output,  most 
researchers  have  reached  the  point  where  they  have  agreed  that  conditional  events  should 
be  identified  as  principal  ideal  cosets  of  events  of  the  original  Boolean  algebra  of  events. 
One  exception  is  Copeland  and  his  colleagues,  who  used  the  "implicative"  Boolean 
algebra approach. But this required the original Boolean algebra to be infinite and of a
very special sort. Indeed, an "implicative" Boolean algebra R must be isomorphic to R/I
for  all  principal  ideals  I  (Copeland  and  Harary,  1953a). 

Except  for  Domotor  (1969),  Pfanzagl  (1971)  and  Calabrese  (1987),  no  justification  is 
proffered  by  those  even  proposing  cosets  of  principal  ideals  as  models  for  conditional 
events.  On  the  other  hand,  Hailperin  postulated  that  a  conditional  event  should  be  an 
element  of  a  Chevalley-Uzkov  ring  of  fractions  of  a  Boolean  ring,  whose  elements  he  then 
shows  are  identifiable  with  cosets  of  principal  ideals  of  the  original  Boolean  ring.  Thus  he 
could  have  skipped  the  ring  of  fractions  step,  cosets  being  a  simpler  notion.  The  idea  is 
not so bad: given two elements a and b of a ring, with $b \neq 0$, form a larger ring in
which $a/b$ makes sense, that is, in which a is divisible by b. The notion is to model
the conditional event $(a|b)$ by the element $a/b$. For Boolean rings, this cannot be done:
trying to "divide" elements of Boolean rings results in trivialities. For example, in the
larger ring, using $b(a/b) = a$ and $bb = b$,

$$ab\,b(a/b) = aba = ab = ab(a/b) = aa = a.$$

But  ab  is  not  necessarily  a.  Further,  using  more  general  "rings  of  quotients"  will  also  lead 
nowhere.  (See  Section  1.2  for  more  details.) 

Among the few who have attempted to define operations among conditional events
with different antecedents (the identical antecedent case being similar to the
unconditional case), only Schay (1968) has justified his choice of (two proposed systems
of) operators, and that indirectly through an abstract characterization theorem. (See Schay
(1968), Theorem 5.) However, these operators are chosen initially on an empirical basis,
and the characterization theorem appears more as an ad hoc than as a natural avenue
for supporting them.

It will be pointed out in Chapter 3 that both pairs of Schay's conjunction and
disjunction operators (and hence Calabrese's operators, since they coincide with one
system of Schay's) violate the min-conjunction and max-disjunction and related
monotonicity properties of probability. Up to now, no one has derived operations on
conditional  events  from  first  principles,  and  related  explicitly  the  coset  form  of  conditional 
events  to  their  potential  operations.  Even  further,  except  for  some  of  Mazurkiewicz's 
rudimentary  results  (see  Section  1.4),  no  connections  have  been  established  between  the 
coset  form  and  conditional  probability  assignment  of  conditional  events. 

As  we  will  see,  there  has  been  a  proliferation  of  definitions  for  conditional  events 
and  of  operations  between  them.  This  is  due  perhaps  to  the  fact  that  each  approach  is 
based  simply  on  some  intuitive  idea  or  some  mathematical  analogy  rather  than  a 
systematic  analysis  of  the  problem  from  basic  concepts,  or  a  more  axiomatic  approach. 

In  summary,  up  to  now  no  satisfactory  first-principles  approach  has  been  taken 
toward  the  exposition  of  a  theory  of  conditional  events.  Our  goal  is  such  a  theory. 

0.3  Outline  of  main  contributions 

With  the  motivation  and  objectives  described  above,  our  effort  will  be  directed  first 
toward  the  development  of  a  mathematically  rigorous  and  comprehensive  theory  of 
measure-free  conditioning.  Specifically,  a  conditioning  operator  compatible  with  the 
probabilistic  conditioning  operator  is  introduced  into  logic.  The  whole  machinery  of 
Boolean  logic  is  extended  to  "conditional  Boolean  logic".  With  this  conditional  Boolean 
logic  as  syntax,  the  associated  conditional  probability  logic  will  extend  classical 
probability logic. (See Hailperin (1984) and Nilsson (1986).) Since conditional
probability logic is a logic for implicative propositions (such as defaults in common sense
reasoning, and production rules in expert systems), our work makes more rigorous, and
goes beyond, that of Adams (1975). Further, it clarifies theoretical issues in algebraic
logic in the new direction of non-monotonic logics for AI. The mathematical setting of
our conditional extension of first-order logic is an algebraic structure extending the
Boolean ring of first order logic, but is not itself a ring. This is, however, compatible with
the goal of achieving non-monotonicity in probabilistic reasoning: more fundamental
structures surrounding the theory of probability must be investigated, as has been pointed
out by Grosof (1988) and Pearl (1988). Thus, structures more general than Boolean rings
must  be  allowed.  This  situation  is  somewhat  analogous  to  that  of  quantum  logic  (Gudder, 
1988).  This  need  to  consider  more  general  algebraic  structures  can  also  have  some 
interest  for  algebraists.  For  example,  combining  cosets  of  different  quotient  rings  of  a 
Boolean  ring  is  possible  in  a  natural  way,  and  the  resulting  algebraic  structure  merits 
attention.  The  generality  in  which  this  phenomenon  holds  is  not  clear,  although  it  does 
extend,  for  example,  to  commutative  von  Neumann  regular  rings.  (See  Chapter  8.)  A 
related  question  of  interest  here  which  arises  is  to  characterize  commutative  partially 
ordered  rings  in  which  cosets  of  principal  ideals  are  intervals. 

The  theory  of  conditioning  developed  in  this  book  can  be  used  to  design  inference 
rules  in  intelligent  machines.  Details  of  these  applications  to  AI  should  be  investigated. 
At  this  point,  we  give  some  flavor  of  the  theory.  We  begin  by  recalling  the  basics  of 
Adams'  logic  of  conditionals  (Adams,  1975),  which  has  been  popularized  in  the  AI 
community  by  Pearl  (1988).  Since  uncertain  implicative  propositions  in  natural  language 
form  the  core  of  human  and  machine  knowledge  used  in  reasoning  and  inference,  a  logic 
of  these  propositions,  called  conditionals,  needs  to  be  developed. 

The  main  thrust  of  Adams'  work  is  the  development  of  a  logic  of  conditionals, 
compatible  with  conditional  probability,  that  is,  probabilities  of  conditionals  are  taken  to 
be  conditional  probabilities.  See  also  (Stalnaker,  1968,  1970)  and  (Lewis,  1976).  In 
classical two-valued logic, the basic structure is a Boolean algebra R of subsets of a
universe of discourse $\Omega$. Thus propositions (events) are represented as mathematical
entities, namely as elements of R. From this, semantics, or truth values are attached to the
"possible worlds". However, Adams, apparently unaware of most of the previous work on
the subject, especially that of Schay (1968), took conditionals, as generalizations of
ordinary events, as primitives in natural language, rather than some entity generalizing
elements in a Boolean ring. (It is interesting to speculate on what Adams' book would be
like had he known of Schay's work of 1968.) Thus in Adams' conditional extension of
classical logic, the collections of conditionals exist only as a formal mathematical
structure. However, as human beings, we understand this primitive concept of
conditionals, and hence, as in classical logic, proceed to build more complicated
conditionals  from  the  simple  ones  via  logical  connectives  "and",  "or",  "not",  and  so  on. 
Now,  these  "conditional"  connectives  are  extensions  of  those  in  ordinary  unconditional 
propositions.  As  in  any  extension  problem,  the  solution  is  not  unique.  Any  proposed 
extension  of  the  logical  operations  for  conditionals  forms  only  one  possible  logic  amongst 
all the possible ones. Adams proposed the following ones (1975, pp. 46-47). Write "if b
then a" as $(a|b)$. He made the following definitions, perhaps based on intuitive grounds:

$$(a|b)' = (a'|b);$$

$$(a|b) \wedge (c|d) = ((b' \vee a) \wedge (d' \vee c)) \,|\, (b \vee d);$$

$$(a|b) \vee (c|d) = ((a \wedge b) \vee (c \wedge d)) \,|\, (b \vee d).$$

These  turn  out  to  be  precisely  Schay’s  operations  (Schay,  1968),  which  will  be  discussed 
in  Chapter  3. 
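
The three definitions above can be tried out on finite sets. The sketch below is illustrative only: it assumes a uniform probability on a six-point universe, represents a conditional $(a|b)$ as a pair of sets, and uses hypothetical helper names (neg, conj, disj, cond_prob). It also exhibits one instance of the behavior noted in Section 0.2, where the min-conjunction property is read, as an assumption, as the inequality P(x and y) <= min(P(x), P(y)).

```python
from fractions import Fraction

# Assumed illustrative universe with the uniform probability.
OMEGA = frozenset(range(1, 7))

def prob(event):
    return Fraction(len(event), len(OMEGA))

def comp(x):
    """Complement within OMEGA."""
    return OMEGA - x

def cond_prob(ce):
    """P(a | b) = P(ab) / P(b) for a conditional ce = (a, b) with P(b) > 0."""
    a, b = ce
    return prob(a & b) / prob(b)

def neg(ce):
    """(a|b)' = (a'|b)."""
    a, b = ce
    return (comp(a), b)

def conj(x, y):
    """(a|b) ^ (c|d) = ((b' v a) ^ (d' v c)) | (b v d)."""
    (a, b), (c, d) = x, y
    return ((comp(b) | a) & (comp(d) | c), b | d)

def disj(x, y):
    """(a|b) v (c|d) = ((a ^ b) v (c ^ d)) | (b v d)."""
    (a, b), (c, d) = x, y
    return ((a & b) | (c & d), b | d)

x = (frozenset(), frozenset({1}))     # hypothetical conditional with P(a|b) = 0
y = (frozenset({2}), frozenset({2}))  # hypothetical conditional with P(c|d) = 1
both = conj(x, y)

print(cond_prob(x), cond_prob(y), cond_prob(both))  # 0 1 1/2
```

Here the conjunction receives probability 1/2 although one of its conjuncts has probability 0. When both antecedents equal the whole universe, conj and disj reduce to ordinary intersection and union.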

The  problem  with  assigning  probabilities  to  compounds  of  conditionals  is  discussed 
using Lewis' triviality result, which says that one cannot model conditionals as elements of
the Boolean ring R, compatible with conditional probability (Adams, 1975, pp. 34-35; Lewis,
1976). Precisely, it says that one cannot associate with "if b then a" an element $\varphi(a,b)$ of
R so that for all probability measures P on R,

$$P(\varphi(a,b)) = P(a|b) = P(ab)/P(b).$$

There  are  some  trivial  exceptions.  A  proof  of  this  fact  will  be  reproduced  in  Chapter  1. 
This  means  that  the  mathematical  entity  modeling  conditionals  must  properly  contain  R. 
This  modeling  of  conditionals  is  the  main  thrust  of  this  book. 

Another  point  in  Adams’  work  is  his  concept  of  "probabilistic  entailment"  (Adams, 
1975,  pp.  56-57).  Since  reasoning  in  intelligent  systems  is  based  on  a  logical  entailment 
relation  in  a  given  logic,  it  is  not  surprising  that  Pearl  (1988)  popularized  Adams’  work 
because  of  this  concept  only.  This  concept  of  entailment  is  particularly  suitable  for 
plausible reasoning in a quantitative way, that is, for conditionals $(a|b)$ in which $P(a|b)$
is high, such as "birds fly". Let $K = \{(a_i|b_i) : i = 1, 2, \ldots, n\}$. Then, by definition, K
implies $(c|d)$ if for each $\varepsilon > 0$, there is a $\delta > 0$ such that for any probability measure
P on R, if $P(a_i|b_i) > 1 - \delta$ for all i, then $P(c|d) > 1 - \varepsilon$. In Chapter 6, we will return to this
concept  to  discuss  its  practical  role  in  automated  reasoning,  especially  in  situations 
different  from  plausible  reasoning,  as  well  as  in  the  computational  aspects  of  conditional 
probability  logic. 

Now  consider  the  problem  of  assigning  a  probability  to  a  compound  statement  of  the 
form S = "if b then a or if d then c", where a, b, c, and d are in R. To do that, we
need to model the statement "if b then a" so that its probability is $P(a|b)$ and then we
must define the connective $\vee$ (or) appropriately. The hope is that S will again be of the
form "if f then e", and $P(S) = P(e|f)$. By Lewis' triviality result, "if b then a" cannot
be an element of R, so we are led to look outside R for a model. (See Chapter 2.) In
Chapter 3, conditional connectives are derived from algebraic considerations, and in
particular, the connective $\vee$ is derived under reasonable assumptions to be

$$(a|b) \vee (c|d) = (ab \vee cd \,|\, ab \vee cd \vee bd),$$

where ab means the intersection or conjunction of a and b, and $a \vee b$ means their
union  or  disjunction.  This  operator  corresponds  to  Lukasiewicz's  three-valued  truth  table 
for  disjunction. 
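
The correspondence with a three-valued truth table can be checked by brute force on a small universe. The sketch below rests on assumptions that are not spelled out in this section: a conditional $(a|b)$ is read pointwise as true on ab, false on a'b, and undefined off b; the undefined value is coded as 1/2; and the three-valued disjunction is taken to be the maximum of the two truth values. All names (truth, disj, matches_lukasiewicz) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

# Assumed small universe, chosen so that all 4-tuples of events can be checked.
OMEGA = frozenset({0, 1, 2})
U = Fraction(1, 2)  # assumed numeric code for the third ("undefined") value

def subsets(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def truth(ce, w):
    """Three-valued truth of the conditional ce = (a, b) at the point w."""
    a, b = ce
    if w not in b:
        return U
    return Fraction(1) if w in a else Fraction(0)

def disj(x, y):
    """(a|b) v (c|d) = (ab v cd | ab v cd v bd)."""
    (a, b), (c, d) = x, y
    ab, cd = a & b, c & d
    return (ab | cd, ab | cd | (b & d))

def matches_lukasiewicz():
    """Compare disj pointwise with max over the values 0, 1/2, 1."""
    events = subsets(OMEGA)
    for a, b, c, d in product(events, repeat=4):
        x, y = (a, b), (c, d)
        z = disj(x, y)
        if any(truth(z, w) != max(truth(x, w), truth(y, w)) for w in OMEGA):
            return False
    return True

print(matches_lukasiewicz())  # True
```

Under this reading, the disjunction is true where either conditional is true, false only where both are false, and undefined elsewhere.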

Another  important  issue  is  that  of  non-monotonic  probabilistic  reasoning  in 
intelligent systems. Its framework is as follows. Let T = <K, E>, where K is a
knowledge base and E is a set of evidence. K consists of a collection of implicative
propositions symbolized as (a_i|b_i), i = 1, 2, 3, ..., n. For example, in the "penguin
triangle" example (Pearl, 1988), these are defaults. Note that in the Bayesian approach,
where the uncertainty in these defaults is taken into account in a more quantitative way, a
default rule of the form "most b's are a's" is modeled semantically as "P(a|b) is high".
E is a collection of factual propositions (evidence). Since elements in E can be viewed
as implicative statements which are implied by the tautology T (true), the reasoning
process will involve a logical entailment relation $ in a conditional logic. Conditionals
of interest are of the form (c|E), where E stands for the Boolean conjunction of all
elements in E, and c is some event of interest. It is desired to know whether (c|E)
follows logically from K. In the case of the penguin triangle example, the ε-semantics of
Adams can be used (Pearl, 1988, Ch. 10). It is necessary to be able to handle production
rules in expert systems rather than just defaults in plausible reasoning, and also to treat the
problem at a syntactic level as in the case of classical first-order logic, where $ is simply
the order relation ≤ in a Boolean ring. Still, this must be done in a way compatible with
conditional probability evaluations. The main problem is the representation of K as a
whole. Putting all (uncertain) information in K together can be done in two different
ways: internal and external. If implicative propositions (a_i|b_i) can be represented as
legitimate quantities, as we do in this book, and if logical operations among them are
available, then an internal combination of information in K consists simply of taking
conjunctions of all the (a_i|b_i). An external combination strategy would consist of
forming a "product" of the (a_i|b_i). (See Chapter 3 for details.) To complete the
reasoning procedure, a logical entailment relation $ in conditional logic needs to be
supplied. It turns out that the order structure of Boolean rings can be extended suitably to
provide the desired $. Moreover, relative to E, that is, to additional facts or evidence, $
is non-monotonic. (See Chapter 8.)

In  summary,  we  first  justify  the  coset  form  for  measure-free  conditional  events  by 
using  an  axiomatic  approach.  A  systematic  investigation  of  logical  operators  among 
conditionals,  including  those  with  different  antecedents,  is  then  carried  out,  resulting  in  a 
space  of  conditional  events.  Realizing  that  conditionals  have  three  possible  truth  values,  a 
systematic  study  of  three-valued  logics  leads  to  the  conclusion  that  systems  of  logical 
operators  among  conditionals  correspond  precisely  to  various  systems  of  three-valued 
logics.  A  conditional  probability  logic  is  formulated,  extending  classical  probability  logic. 
In a direction of generalization, we devote a chapter to conditioning in a fuzzy setting.

0.4  Overview  of  the  book 

In  view  of  the  state-of-the-art  presented  above,  we  have  looked  again  at  the  problem 
in  the  last  several  years  (Goodman,  1987,  Goodman  and  Nguyen,  1988).  The  present 
book  is  based  essentially  on  our  earlier  unpublished  work  "A  Theory  of  Measure-Free 
Conditioning"  (1987).  Some  of  the  results  have  already  appeared  in  print,  and  are  here 
augmented  by  new  and  improved  procedures.  In  our  view,  it  is  not  too  early  to  provide  a 
comprehensive  presentation  of  the  theory  of  measure-free  conditioning.  It  is  our  hope  that 
this  book  will  stimulate  further  basic  research  in  this  area. 

The  basic  program  consists  of  nine  parts: 

(1)  Formulation  of  the  conditional  event  problem  (Chapter  0). 

(2)  Extensive  literature  review  pointing  up  the  lack  of  a  systematic  investigation  of 
the  problem  (Chapter  1). 

(3)  Derivation  of  the  necessary  form  that  a  conditional  event  must  take,  namely  that 
of  a  coset  of  a  principal  ideal  in  a  Boolean  algebra  of  events  (Chapter  2). 

(4)  Derivation  of  the  appropriate  operations  on  conditional  events  and  development 
of  the  calculus  of  these  operations  and  the  partial  order  extending  the  usual  subset 
relation  of  ordinary  events,  together  with  a  justification  of  the  proposed  conditional 
logic via three-valued logic (Chapter 3).

(5)  Establishment  of  relevant  algebraic  properties  and  a  characterization  of  the 
algebra  of  conditional  events,  and  an  extension  of  the  Stone  Representation  Theorem 
to  this  conditional  setting  (Chapter  4). 

(6)  An  analysis  of  the  assignment  of  conditional  probability  to  conditional  events 
(Chapter  5). 

(7) The development of conditional probability logic whose algebraic structure is
the  conditional  event  algebra  (Chapter  6). 

(8)  The  generalization  of  results  to  fuzzy  events  (Chapter  7). 

(9)  The  investigation  of  iterated  conditioning,  and  miscellaneous  issues  (Chapter  8). 

CHAPTER 1

A  SURVEY  OF  PREVIOUS  WORK  ON  CONDITIONAL  EVENTS 

As  stated  before,  investigations  of  conditional  events  and  their  operations  have  a 
long history, but are not well known among probabilists, logicians, and computer
scientists,  who  make  up  a  good  deal  of  the  AI  community.  Our  literature  search  revealed 
that  the  topic  has  been  considered,  independently  and  at  infrequent  intervals  of  time,  by 
logicians  and  mathematicians,  dating  back  to  an  idea  in  Boole’s  book  (1854).  Below  we 
present  the  main  approaches  that  have  been  taken,  as  well  as  duplications  of  effort  that 
have  been  made.  As  we  will  see,  throughout  the  development  of  the  mathematical  theory 
of  conditional  events  and  their  calculus,  there  has  been  a  proliferation  of  definitions  of 
these  objects  and  of  operations  among  them.  This  is  due  to  the  fact  that  each  approach 
has  been  based  upon  some  intuitive  idea  or  some  analogy,  rather  than  a  systematic 
analysis  from  a  first  principles  or  axiomatic  approach.  An  axiomatic  approach  -  which  we 
take  here  -  should  not  only  justify  rigorously  the  correct  forms  for  conditional  events,  but 
should also shed light on the ones investigated so far. In the same vein, a reasonable
conditional  logic  should  be  able  to  be  defended  axiomatically.  See  Chapters  2  and  3. 

1.1  Implicative  Boolean  algebras  and  Lewis'  triviality  result 

The first approach considered for modeling conditional events is that of Copeland
(1941,  1945,  1950).  See  also  Copeland  and  Harary  (1953a,  1953b),  Balbes  (1970),  and 
Jonsson  (1954). 

Let R be a Boolean ring and let → be material implication, that is, b → a is the
element b' ∨ a. If P is a probability measure on R, then in general

    P(b → a) ≠ P(a|b). (1)

The simple-appearing expression in (1) belies an interesting and significant history. First,
it can be improved as follows.

    P(b → a) = P(b' ∨ a) = P(b' ∨ ab)
             = P(b') + P(ab) = P(b') + P(a|b)P(b)
             = P(b') + P(a|b)[1 - P(b')]
             = P(a|b) + P(b')[1 - P(a|b)]
             = P(a|b) + P(b')P(a'|b) ≥ P(a|b), (2)

with equality holding if and only if P(a'b) = 0 or P(b) = 1, a rather trivial case.
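
The identity and inequality in (2), together with the equality condition, can be confirmed by exact arithmetic on a small example. The following is a minimal sketch under assumptions of our own: a three-point sample space with illustrative positive weights (the names `OMEGA`, `WEIGHTS` and `check_relation_2` are hypothetical and do not come from the text).

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({"x", "y", "z"})
# Hypothetical weights, chosen only for illustration (all positive, sum 1).
WEIGHTS = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}


def prob(event):
    """Probability of an event, computed exactly."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))


def events():
    """All subsets of OMEGA."""
    points = sorted(OMEGA)
    return [frozenset(c) for k in range(len(points) + 1)
            for c in combinations(points, k)]


def check_relation_2():
    """Check (2) and its equality condition for all a, b with P(b) > 0."""
    for a in events():
        for b in events():
            if prob(b) == 0:
                continue
            not_a, not_b = OMEGA - a, OMEGA - b
            p_cond = prob(a & b) / prob(b)            # P(a|b)
            p_cond_not_a = prob(not_a & b) / prob(b)  # P(a'|b)
            material = prob(not_b | a)                # P(b -> a) = P(b' V a)
            if material != p_cond + prob(not_b) * p_cond_not_a:
                return False
            if material < p_cond:
                return False
            trivial = prob(not_a & b) == 0 or prob(b) == 1
            if (material == p_cond) != trivial:
                return False
    return True


print(check_relation_2())
```

Exact fractions are used instead of floating point so that the equality case is tested without rounding error.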

Popper (1963, p. 390, formula 22) was among the first to recognize a form of (2),
although earlier, in 1956, Copeland (p. 42) implicitly used the inequality as a springboard
for his implicative Boolean algebra work. Calabrese independently recognized (2) in 1975
and later in 1987 (p. 201), motivating his development of conditional events outside of
the Boolean algebra of unconditional events R. It is tempting to seek another binary
operation ◊ on R such that P(a◊b) = P(a|b) whenever P(a|b) is well defined.
In other words, one would like to know whether "conditional
events" can be modeled as ordinary events, that is, as elements of R. It turns out that,
except for trivial cases, the answer is negative (Lewis, 1976; Adams, 1975). Later
Calabrese (1987), unaware of Lewis' so-called "triviality result", showed, using the normal
disjunctive form of Boolean polynomials, that such a ◊ could not be Boolean, that is,
expressible in terms of union, intersection, and complement. Copeland proceeded directly
to the search for such a ◊, and consequently only obtained trivial cases. We now discuss
Lewis' Triviality Result, and then outline Copeland's work on implicative Boolean
algebras.

Theorem 1 (Lewis' Triviality Result). Let R be a Boolean ring with more than four
elements. Then there is no binary operation ◊ on R such that for all probability
measures P on R, and all a, b in R with P(b) > 0,

    P(a◊b) = P(a|b).

Proof. Suppose ◊ exists. For a probability measure P on R and an element
r ∈ R with P(r) ≠ 0, denote by P_r the probability measure on R given by
P_r(x) = P(xr)/P(r). Now, if a and b are in R and P(ab) ≠ 0 ≠ P(a'b), then a and b
are P-independent. Indeed, since P(a), P(a'), and P(b) are all positive, we have

    P(a|b) = P(a◊b)
           = P((a◊b)a) + P((a◊b)a')
           = P((a◊b)|a)P(a) + P((a◊b)|a')P(a')
           = P_a(a◊b)P(a) + P_a'(a◊b)P(a')
           = P_a(a|b)P(a) + P_a'(a|b)P(a')
           = (P_a(ab)/P_a(b))P(a) + (P_a'(ab)/P_a'(b))P(a')
           = P(a) + 0
           = P(a).

Since R has more than four elements, we can find a and b in R such that
ab ≠ 0 ≠ a'b. Indeed, let b be in R, b ≠ 1, and let a ∈ R with a ≠ b'. If ab = 0, then
a < b' and ab' ≠ 0 ≠ a'b'. If ab ≠ 0, then ab ≠ 0 ≠ a'b, else b ≤ a. In any case, we
have elements a and b of R with ab ≠ 0 ≠ a'b and b ≠ 1. By Stone's Representation
Theorem, R is a subalgebra of the algebra 𝒫(Ω) of all subsets of some set Ω. Let x, y
and z be elements of Ω such that x ∈ ab, y ∈ a'b and z ∉ b. Let P be a probability
measure on 𝒫(Ω) with P({x}) = P({y}) = P({z}) = 1/3. Then P is a probability
measure on R such that P(ab) ≠ 0 ≠ P(a'b). Now P(a|b) = 1/2 while P(a) = 2/3 or
1/3, depending on whether or not z ∈ a. Thus P(a|b) ≠ P(a), and ◊ cannot exist. □

The proof above is based on Lewis' original proof (Lewis, 1976). If R has four or
fewer elements, then the reader may verify easily the existence of a ◊ satisfying the
condition in the theorem.

As a simple example when such a ◊ does not exist, let Ω = {x, y, z} and R = 𝒫(Ω).
Define P by P(x) = P(y) = P(z) = 1/3. Let a = {x} and b = {x, y}. Then
P(ab) ≠ 0 ≠ P(a'b), and P(a|b) = 1/2, while P(a) = 1/3. Of course, this is essentially the
construction at the end of the proof just given.
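
For this three-point example one can see the obstruction directly: under the uniform measure no event at all has probability 1/2, so no element of R can play the role of a◊b. The sketch below enumerates the eight events; the helper names are ours and purely illustrative.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = ("x", "y", "z")


def powerset(points):
    """All subsets of the given points, as frozensets."""
    return [frozenset(c) for k in range(len(points) + 1)
            for c in combinations(points, k)]


def prob(event):
    """Uniform probability on OMEGA: each point has mass 1/3."""
    return Fraction(len(event), len(OMEGA))


a = frozenset({"x"})
b = frozenset({"x", "y"})

p_cond = prob(a & b) / prob(b)                     # P(a|b)
attainable = {prob(e) for e in powerset(OMEGA)}    # values P takes on R
candidates = [e for e in powerset(OMEGA) if prob(e) == p_cond]

print(p_cond, sorted(attainable), candidates)
```

Here `candidates` collects every event whose probability equals P(a|b); an empty list means that, already for this single measure, no event can represent the conditional.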

When R is finite, there is another approach to Lewis' Triviality Result. If an
operation ◊ on R existed satisfying P(a◊b) = P(a|b), then P(a|b) could take no more
than #(R) values, where #(R) denotes the number of elements of R. This is simply
because a◊b is an element of R. Thus, to prove Lewis' Triviality Result, it suffices to
construct on R a probability measure P such that P(a|b) takes more than #(R)
values. We will do a bit more.

Theorem 2. Let Ω be a finite set, and let R be the Boolean algebra of all subsets of Ω.
If Ω has n elements, n > 0, and P is any probability measure on R, then there are no
more than 3^n - 2^{n+1} + 3 possible values for P(a|b). Further, there is a probability
measure P on R such that P(a|b) takes on 3^n - 2^{n+1} + 3 distinct values.

Proof. Since P(a|b) = P(ab)/P(b), to get the number of possible values of P(a|b)
not 0 or 1, we simply have to count the number of pairs (a, b) in R, that is, the number
of pairs (a, b) of subsets of Ω, with ∅ < a < b. But this is the number

    Σ_{m=1}^{n} (2^m - 2) C(n, m) = (3^n - 1) - 2(2^n - 1) = 3^n - 2^{n+1} + 1,

where C(n, m) is the binomial coefficient, and the first part of the theorem follows.
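
The count in the first part of the proof can be reproduced by brute force for small n. This is a sketch only; the function names are hypothetical, and subsets of an n-element set are represented as frozensets of indices.

```python
from itertools import combinations


def subsets(n):
    """All subsets of {0, ..., n-1}, as frozensets."""
    return [frozenset(c) for k in range(n + 1)
            for c in combinations(range(n), k)]


def count_nested_pairs(n):
    """Number of pairs (a, b) with a nonempty and a a proper subset of b."""
    events = subsets(n)
    # For frozensets, a < b tests for a proper subset.
    return sum(1 for a in events for b in events if a and a < b)


def value_bound(n):
    """Bound of Theorem 2: the nested pairs plus the two values 0 and 1."""
    return 3 ** n - 2 ** (n + 1) + 3


for n in range(1, 6):
    print(n, count_nested_pairs(n), value_bound(n), 2 ** n)
```

The last column, 2^n, is the number of elements of R; comparing it with the bound shows from which n onward the number of attainable conditional values can exceed the number of events.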

If μ is any bounded measure on R, then P = μ/μ(Ω) is a probability measure on
R, and P(a|b) = P(ab)/P(b) = μ(ab)/μ(b). We will prescribe a bounded measure μ on
R such that distinct pairs (a, b) of elements with ∅ < a < b give distinct μ(a)/μ(b). A
measure is prescribed on R by assigning to each of its singletons, that is, to each element
of Ω, a positive number. We will have the desired measure, then, if there are n positive
numbers α_1, α_2, ..., α_n such that if I, J, K, and L are subsets of {1, 2, ..., n} with
∅ ⊂ I ⊂ J and ∅ ⊂ K ⊂ L, α_I is the sum of those α_i with i ∈ I, and (I, J) ≠ (K, L), then
α_I/α_J ≠ α_K/α_L. This construction is relegated to the following lemma. □


Lemma 1. Let n > 0. Then there exist positive numbers α_1, α_2, ..., α_n such that if I,
J, K, and L are subsets of {1, 2, ..., n} with ∅ ⊂ I ⊂ J and ∅ ⊂ K ⊂ L, α_I is the sum
of those α_i with i ∈ I, and (I, J) ≠ (K, L), then α_I/α_J ≠ α_K/α_L.

Proof. We get the desired α_i's inductively. Let α_1 be any positive number greater
than 1. (There is some convenience in having α_1 > 1.) Having chosen α_1, α_2, ...,
α_{i-1}, for 1 < i ≤ n, let α_i be a positive number such that

    α_i > (α_1 + α_2 + ... + α_{i-1})^2.

For example, if α_1 is taken to be 2, then α_2 may be taken to be greater than (2)^2 = 4.
If α_2 is taken to be 5, say, then α_3 may be taken to be greater than (α_1 + α_2)^2 = (2 + 5)^2
= 49, and so on. Note that α_1 < α_2 < ... < α_n. Now suppose that α_I/α_J = α_K/α_L, with
the I, J, K, and L having the properties noted above. Let m be the largest index such
that m is in one of the sets I, J, K, and L, and is not in all four. There is such an m
because (I, J) ≠ (K, L). Let s be the sum of those α_i with i > m and i ∈ I, which is 0,
of course, if m = n. There are really only three distinct cases to consider:

(a) m is in I, J, and L, and is not in K;

(b) m is in J and L, and is not in I or K;

(c) m is in J and not in I or K or L.

Case (a) is the case that m is in exactly three of the sets; case (b) is the case that m is in
exactly two of the sets, and case (c) is the case that m is in exactly one of the sets.
Write α_I = u + α_m + s or u + s, depending on whether or not m ∈ I. Then α_J = u + v +
α_m + s, since m ∈ J in all three cases. The number u is just Σ α_i, where i ∈ I and
i < m. This number may be 0. Now v = Σ α_i, where i ∈ J, i < m, and i ∉ I. Similarly,
α_K = x + s, since we never have to consider the case where m ∈ K, and finally,
α_L = x + y + α_m + s or x + y + s, depending on whether or not m ∈ L. An important
point is that u, v, x, and y are sums of α_i's for some i's < m, so are small relative to
α_m and to s, if s > 0.

Suppose that we are in case (a), and that α_I/α_J = α_K/α_L. This gives

    (u + α_m + s)/(u + v + α_m + s) = (x + s)/(x + y + α_m + s),

whence

    (u + α_m + s)(x + y + α_m + s) = (u + v + α_m + s)(x + s).

This equality yields

    s(α_m - v + y) + uy + uα_m + yα_m + (α_m)^2 = xv.

This is impossible. All the terms on the left are ≥ 0, and (α_m)^2 > xv since

    α_m > (α_1 + α_2 + ... + α_{m-1})^2 ≥ xv.

In case (b), we have the equality

    (u + s)/(u + v + α_m + s) = (x + s)/(x + y + α_m + s),

which yields

    s(y - v) = x(α_m + v) - u(α_m + y).

The left side s(y - v) must be 0. Otherwise,

    s ≠ 0, |y - v| > 1,

and

    |x(α_m + v) - u(α_m + y)| < (α_1 + α_2 + ... + α_m)^2 < α_{m+1} < s|y - v|.

But if s(y - v) = 0, then

    α_m(x - u) = uy - xv,

whence x = u, from which it follows that y = v. But this means that I = K and J = L,
which is not the case.

In case (c), we have the equality

    (u + s)/(u + v + α_m + s) = (x + s)/(x + y + s),

which yields

    s(α_m + v - y) = uy - vx - xα_m.

If s > 0, the left side is positive, and the right side is negative unless x = 0. In this case
s(α_m + v - y) > uy since s > (α_1 + α_2 + ... + α_m)^2 > uy. If s = 0, then xα_m = uy - vx,
so x = 0. But then K = ∅, which is not allowed. □
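
The recipe of the lemma is easy to try out numerically. The sketch below makes one particular admissible choice, which is our assumption and not prescribed by the text: α_1 = 2 and each later α_i equal to the square of the sum of its predecessors plus 1, so that all α_i are integers. It then checks by brute force that the ratios α_I/α_J are pairwise distinct. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def lemma_weights(n):
    """One admissible choice: alpha_1 = 2, alpha_i = (sum of earlier ones)^2 + 1."""
    alphas = [2]
    while len(alphas) < n:
        alphas.append(sum(alphas) ** 2 + 1)
    return alphas


def ratios(alphas):
    """All alpha_I / alpha_J with I nonempty and I a proper subset of J."""
    indices = range(len(alphas))
    sets = [frozenset(c) for k in range(1, len(alphas) + 1)
            for c in combinations(indices, k)]

    def total(index_set):
        return sum(alphas[i] for i in index_set)

    return [Fraction(total(i), total(j)) for j in sets for i in sets if i < j]


def all_distinct(alphas):
    """True if no two nested pairs give the same ratio."""
    values = ratios(alphas)
    return len(values) == len(set(values))


print(lemma_weights(4))
print(all_distinct(lemma_weights(4)))
print(all_distinct([1, 2, 3]))
```

The last call uses weights that do not follow the recipe: with weights 1, 2, 3 the ratio 1/3 arises both from the pair ({1}, {1, 2}) and from ({2}, {1, 2, 3}), so some growth condition on the α_i is genuinely needed.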

There are a couple of comments that should be made about getting the desired α_i
in the proof of Theorem 2. First, it is clear that they can be chosen to be integers, in
which case the resulting probability measure will have all rational values. Secondly, that
such α_i exist is no problem. Taking the α_i to be algebraically independent (over the
rational numbers) will get the desired distinct P(a|b). Being algebraically independent
means that there is no non-trivial polynomial f with rational coefficients such that
f(α_1, α_2, ..., α_n) = 0. If α_I/α_J = α_K/α_L for some (I, J) ≠ (K, L), then
α_I α_L - α_J α_K is a non-trivial polynomial in the α_i's, and is 0. That such algebraically
independent sets exist is a well known algebraic fact. A good reference is Hungerford
(1974, page 311).

An immediate corollary of Theorem 2 is Lewis' Triviality Result for those R that
are the algebra of all subsets of a set with at least 3 elements. Indeed, in that case

    3^n - 2^{n+1} + 3 > 2^n,

and since there is a probability measure P on R such that P(a|b) takes 3^n - 2^{n+1} + 3
distinct values, there is no binary operation ◊ on R such that P(a◊b) = P(a|b) for all a
and b in R.

Now suppose that R is any Boolean algebra with q elements, q > 4. Then R is a
subalgebra of the algebra 𝒫(Ω) of all subsets of a finite set Ω with at least three
elements. As we have seen, there is a probability measure P on 𝒫(Ω) taking distinct
values P(a|b) not 0 or 1 for every pair (a, b) with ∅ < a < b. The restriction of P
to R is a probability measure on R, and taking b = 1 yields q distinct values for
P(a|b), counting 0 and 1. As in the proof of Lewis' Triviality Result above, there are
elements a and b in R with ab ≠ 0 ≠ a'b, and b ≠ 1. Thus P(ab|b) is yet another
value, yielding more than q values of P(a|b) for elements a and b of R. This
implies Lewis' Triviality Result for any finite Boolean algebra with more than four
elements. It is not clear how our theorem can be used to prove Lewis' Triviality Result for
infinite R.

The theorem also shows that if R is the algebra of all subsets of a set with n
elements, then any model S of the space of conditional events that is compatible with
probability must have at least 3^n - 2^{n+1} + 3 elements, and if R is any finite Boolean
algebra such a model must be larger than R.

The key fact in all the above is the existence of a probability measure P on the
Boolean algebra of all subsets of a finite set Ω that gives distinct values P(a|b) to
distinct pairs (a, b) with ∅ < a < b. The existence of such a P is a
trivial consequence of the existence of algebraically independent real numbers, as we have
noted. To actually construct a measure, as we did in the lemma, yielding the desired
probability involved a bit of arithmetic, but an elementary prescription was given for the
numbers needed.

Following is an alternate proof of the existence of such a probability measure on
these special finite Boolean algebras. It may have some independent interest. We are
going to show that if R = 𝒫(Ω), with Ω finite, then there is a P_Ω on R such that P_Ω
is one-to-one on the set

    {(a, b) : ∅ ≠ a < b, a, b ∈ R},

that is, whenever ∅ ≠ a_1 < b_1, ∅ ≠ a_2 < b_2 and (a_1, b_1) ≠ (a_2, b_2), then we have
P_Ω(a_1|b_1) ≠ P_Ω(a_2|b_2). We will carry out the proof of this by induction on #(Ω).

Specifically, we are going to show that for each n ≥ 1, there is a probability measure P_n
on R_n = 𝒫(Ω_n), where Ω_n = {ω_1, ..., ω_n}, such that

    #{P_n(a|b) : a, b ∈ R_n, P_n(b) > 0} = 3^n - 2^{n+1} + 3.


For n = 1, Ω_1 = {ω_1}, and R_1 = {∅, Ω_1}. Let P_1 = δ_{ω_1}, that is, P_1(∅) = 0,
P_1(Ω_1) = 1. We have

    #{P_1(a|b) : a = ∅, Ω_1; b = Ω_1} = 2 = 3 - 2^2 + 3.

For n = 2, Ω_2 = {ω_1, ω_2}, and R_2 = {∅, ω_1, ω_2, {ω_1, ω_2}}. Denoting as usual
the Dirac mass point probability at ω by δ_ω, let P_2 = (1/3)δ_{ω_1} + (2/3)δ_{ω_2}. We have

    {(a, b) : ∅ ≠ a < b} = {(ω_1, {ω_1, ω_2}), (ω_2, {ω_1, ω_2})},

    P_2(∅|ω_1) = P_2(∅|ω_2) = P_2(ω_1|ω_2) = P_2(ω_2|ω_1)
               = P_2(∅|{ω_1, ω_2}) = 0;

    P_2(ω_1|ω_1) = P_2(ω_2|ω_2) = P_2({ω_1, ω_2}|{ω_1, ω_2}) = 1;

    P_2(ω_1|{ω_1, ω_2}) = 1/3 ≠ 2/3 = P_2(ω_2|{ω_1, ω_2}),

and

    #{P_2(a|b) : a, b ∈ R_2, b ≠ ∅} = 4 = 3^2 - 2^3 + 3.

Note that P_1, P_2 are both one-to-one on R_1, R_2, respectively, and take rational values in
[0, 1].

Assume that up to n, there is P_n : R_n → [0, 1] of the form

    P_n(ω_j) = r_j/s_n, j = 1, ..., n; s_n = Σ_{j=1}^{n} r_j,

where the r_j's are distinct positive integers, such that

    #{P_n(a|b) : a, b ∈ R_n, b ≠ ∅} = 3^n - 2^{n+1} + 3.

Consider Ω_{n+1} = {ω_1, ..., ω_n, ω_{n+1}}. Define

    P_{n+1}(ω_j) = r_j/s_{n+1}, j = 1, ..., n + 1; s_{n+1} = Σ_{j=1}^{n+1} r_j,

and r_{n+1} is to be determined. Let ∅ ≠ a_1 < b_1, ∅ ≠ a_2 < b_2,
a_1, a_2, b_1, b_2 ∈ R_{n+1}, (a_1, b_1) ≠ (a_2, b_2). Note that

    Ω_{n+1} = (b_1 ∪ b_2) ∪ b_1'b_2'.

If ω_{n+1} ∈ b_1'b_2', then ω_{n+1} ∉ a_i, b_i, i = 1, 2; and

    P_{n+1}(a_i|b_i) = P_{n+1}(a_i)/P_{n+1}(b_i) = (Σ_{j: ω_j ∈ a_i} r_j)/(Σ_{j: ω_j ∈ b_i} r_j).

Thus, by hypothesis of induction,

    P_{n+1}(a_1|b_1) ≠ P_{n+1}(a_2|b_2)

when ω_{n+1} ∈ b_1'b_2', for any choice of integer r_{n+1} different from the r_j's,
j = 1, 2, ..., n.

Consider next the case where ω_{n+1} ∈ b_1 ∪ b_2. A partition of b_1 ∪ b_2 is

[a ^2,  alb2’  ^ lb 1*^2’  b1^2*  b * 

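To express the fact that ω_{n+1} ∈ a_i or ω_{n+1} ∉ a_i, we write

    a_i = a_{i,0} ∪ λ_{i,1}

for λ_{i,1} = {ω_{n+1}} or ∅, respectively. Similarly

    b_i = b_{i,0} ∪ λ_{i,2},

with a_{i,0}, b_{i,0} ∈ R_n. Define

    τ_{i,j} = 1 if λ_{i,j} = {ω_{n+1}},
    τ_{i,j} = 0 if λ_{i,j} = ∅,

    α_i = Σ_{j: ω_j ∈ a_{i,0}} r_j,
    β_i = Σ_{j: ω_j ∈ b_{i,0}} r_j.

We have

    P_{n+1}(a_i|b_i) = P_{n+1}(a_i)/P_{n+1}(b_i)
                     = [P_{n+1}(a_{i,0}) + τ_{i,1}P_{n+1}(ω_{n+1})]/[P_{n+1}(b_{i,0}) + τ_{i,2}P_{n+1}(ω_{n+1})]
                     = [α_i + τ_{i,1}r_{n+1}]/[β_i + τ_{i,2}r_{n+1}].

Hence

    P_{n+1}(a_1|b_1) = P_{n+1}(a_2|b_2)

if and only if r_{n+1} satisfies

    A r_{n+1}^2 + B r_{n+1} + C = 0, (**)

where

    A = τ_{1,1}τ_{2,2} - τ_{1,2}τ_{2,1},
    B = τ_{2,2}α_1 + τ_{1,1}β_2 - τ_{1,2}α_2 - τ_{2,1}β_1,
    C = α_1β_2 - α_2β_1.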


Our proof will be complete if for each pair (a_1, b_1), (a_2, b_2) the coefficients A, B,
C cannot be all zero, since then either A ≠ 0 or B ≠ 0, and hence the quadratic equation
(**) will have at most two solutions. Let J_{n+1} denote the set of all such solutions for all
possible pairs (a_1, b_1), (a_2, b_2). Obviously, J_{n+1} is finite. It suffices to choose r_{n+1}
to be a positive integer not in J_{n+1} and different from all the r_j's, j = 1, 2, ..., n.

The last point to show is:

    A = B = C = 0 is impossible.

For this purpose, suppose

    A = B = C = 0.

Under this assumption, and in view of the hypotheses concerning a_i, b_i, i = 1, 2, we note
that:

(i) β_i > 0, i = 1, 2.

(ii) If α_1 = 0 then α_2 = 0 and conversely.

In this case, we have

    τ_{1,1}β_2 = τ_{2,1}β_1

since

    B = 0.

But τ_{1,1}, τ_{2,1} > 0, since otherwise, a_1 = a_2 = ∅. Thus, τ_{1,1} = τ_{2,1} = 1 and hence
β_1 = β_2. But this will imply that b_1 = b_2 and a_1 = a_2, contradicting the hypotheses.

Hence:

    0 < α_i, i = 1, 2.

(iii) In fact, 0 < α_i < β_i, i = 1, 2. But this is equivalent to

    ∅ < a_{i,0} < b_{i,0}, i = 1, 2.

Now, C = 0 applied to this case implies

    α_1/β_1 = α_2/β_2,

which is equivalent to P_n(a_{1,0}|b_{1,0}) = P_n(a_{2,0}|b_{2,0}). By the induction hypothesis,

    ∅ < a_{1,0} = a_{2,0} < b_{1,0} = b_{2,0},

that is,

    0 < α_1 = α_2 < β_1 = β_2,

which implies, using B = 0,

    a_1 = a_2 and b_1 = b_2,

a contradiction again.

Thus, in summary, A = B = C = 0 cannot hold. □
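
The inductive step says that only finitely many integers must be avoided when choosing r_{n+1}. A brute-force version of this idea is sketched below; it is not the authors' procedure but a greedy stand-in of our own: starting from r_1 = 1, it tries successive positive integers until all conditional probabilities over nested pairs remain distinct. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def nested_ratios(weights):
    """Values P(a|b) for nonempty a properly inside b, with atom masses r_j / s."""
    indices = range(len(weights))
    sets = [frozenset(c) for k in range(1, len(weights) + 1)
            for c in combinations(indices, k)]

    def mass(index_set):
        return sum(weights[i] for i in index_set)

    # The normalizing constant s cancels in the ratio P(a)/P(b).
    return [Fraction(mass(a), mass(b)) for b in sets for a in sets if a < b]


def separates(weights):
    """True if distinct nested pairs always give distinct conditional probabilities."""
    values = nested_ratios(weights)
    return len(values) == len(set(values))


def extend(weights):
    """Append the smallest unused positive integer that keeps all values distinct."""
    r = 1
    while r in weights or not separates(weights + [r]):
        r += 1
    return weights + [r]


def build(n):
    """Greedy integer weights r_1, ..., r_n, starting from r_1 = 1."""
    weights = [1]
    while len(weights) < n:
        weights = extend(weights)
    return weights


print(build(3))
print(separates(build(4)))
```

For n = 2 the greedy choice gives the weights 1 and 2, that is, the measure P_2 = (1/3)δ_{ω_1} + (2/3)δ_{ω_2} used in the text. The loop in `extend` terminates because, by the argument above, only finitely many values of r_{n+1} are excluded.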

Here is an alternate proof of Lewis' Triviality Result in the general case. For a
Boolean ring R, a probability measure P on R, and a mapping f : R × R → R, define

    𝒫_P = {P_b : P_b(·) = P(·|b), b ∈ R, P(b) > 0},

    𝒰_P = {(a, b) : a, b ∈ R, P(b) > 0, P(a|b) = 0 or 1},

    𝒱_{f,P} = {(a, b) : a, b ∈ R, P(b) > 0, and P_c[f(a, b)] = P_c(a|b)
               for all c ∈ R such that P(bc) > 0},

and

    𝒮_P = {(a, b) : a, b ∈ R, P(b) > 0, (a, b) ∉ 𝒰_P, and 0 < P(ab) = P(a)P(b) < 1}.

Then

    𝒱_{f,P} ⊆ 𝒰_P ∪ 𝒮_P.

Indeed, let (a, b) ∈ 𝒱_{f,P} with (a, b) ∉ 𝒰_P. Then 0 < P(a|b) < 1. For c = 1,
P(bc) = P(b) > 0, and we have

    P[f(a, b)] = P(a|b).

Also,

    P_a[f(a, b)] = P_a(a|b) = 1,

and

    P_a'[f(a, b)] = P_a'(a|b) = 0,

whence

    P(a|b) = P[f(a, b)] = P[a·f(a, b)] + P[a'·f(a, b)] = P(a),

that is, P(ab) = P(a)P(b), which means that (a, b) ∈ 𝒮_P.

As a consequence, if R is a Boolean ring having at least two elements a, b such
that 0 < a < b < 1, then there is no map f : R² → R compatible with conditional
probability. Indeed, if such a map f exists, then choose P to be a probability measure
on R such that 0 < P(a) < P(b) < 1. We have

    (a, b) ∉ 𝒰_P, (a, b) ∈ 𝒱_{f,P}.

But

    0 < P(a)P(b) < P(a) = P(ab) < 1,

and (a, b) ∉ 𝒮_P, a contradiction. Thus no such map f exists, and Lewis' Triviality
Result follows. □

We turn now to Copeland's work. He looked for a mathematical operator
representing the logical connective "if" in R, analogous to division. That is, he apparently
had in mind modeling (a|b) with the "fraction" a/b in R. With this he introduced the
following notion.

Definition. An implicative Boolean ring is a Boolean ring R together with an additional
binary operation × satisfying the following, for elements a, b, c in R.

(i) a×(b×c) = (a×b)×c,

(ii) a×(b + c) = a×b + a×c,

(iii) a×(bc) = (a×b)(a×c),

(iv) if a×b = a×c and a ≠ 0, then b = c,

(v) a×1 = a, and

(vi) for every a and b with b ≠ 0, there is an element c such that ab = b×c.

Axioms (iv) and (vi) enable one to define an operator ◊ by ab = b×(b◊a), for b ≠ 0.
For b ≠ 0, the mapping Rb → R : rb → b◊(rb) is one-to-one and onto. Indeed,
b◊(rb) = b◊(sb) implies that

    rb = rbb = b×(b◊(rb)) = b×(b◊(sb)) = sbb = sb,

so that the mapping is one-to-one. Since

    x(x×y) = (x×1)(x×y) = x×(1y) = x×y,

we have x×y ≤ x. Thus b×r is an element of Rb and b×r → b◊(b×r) = r, since
b×r = (b×r)b = b×(b◊(b×r)). Thus the mapping is onto. This shows that every
implicative Boolean ring ≠ {0} is infinite, since R and Rb are in one-to-one
correspondence and Rb is smaller than R for b ≠ 1 and for R finite. The mapping above
is actually an isomorphism, so R is isomorphic to Rb for every b ≠ 0. This severely
limits the usefulness of implicative Boolean rings.
Also, there do not seem to be any appealing examples of them.

By a "probability", Copeland meant any probability measure P on R such that

    P(a×b) = P(a)P(b).

This makes ◊ compatible with such probabilities, since

    P(a|b) = P(ab)/P(b) = P(b×(b◊a))/P(b) = P(b)P(b◊a)/P(b) = P(b◊a).

If all probabilities on R satisfied this condition, then Lewis' Triviality Result would imply
that there were no implicative Boolean rings except {0}. Thus given a non-trivial
implicative Boolean ring, only some probability measures on it are allowable, and
Copeland does not elaborate on that point.

Since  the  intention  of  Copeland  was  to  stay  in  the  ring  R,  the  problem  of 
"conditional  logic",  that  is,  considering  operations  between  conditional  events,  did  not 
arise.  In  any  case,  this  approach  through  implicative  Boolean  rings  seems  futile.  As 
Pfanzagl (1971, Chapter 12) has pointed out, if (a|b) ∈ R, then c ∧ (a|b) admits no
semantic interpretation.

1.2  Division  of  events 

Although Copeland did mention that his operator "if" is somewhat analogous to
division, he did not elaborate further on this connection. It turns out that in Boole's basic
work (Boole, 1854), the problem of interpretation of division of propositions was
considered in some detail. However, since all elements except 1 in a Boolean ring are
zero divisors, there are bound to be some difficulties with this approach. We now outline
the idea of Boole and the follow-up work of Hailperin.

Boole’s  division  interpretation 

In his basic work (Boole, 1854), which laid the foundation of symbolic logic,
Boole gave a rather vague interpretation of division in a Boolean ring. For elements $a$
and $b$ of $R$, the element $a/b$ is defined to be an element of $R$ such that $(a/b)b = a$. Now
such an element exists only if $a \le b$. This difficulty can be circumvented by requiring that
$a/b = ab/b$. But then, instead of trying to solve the equation $(ab/b)b = ab$ for $ab/b$, which
has many solutions, indeed the whole coset $a + Rb'$, which of course is not an element of
$R$, Boole proceeded differently. Writing down the normal disjunctive form of a binary
Boolean function as

$$f(a,b) = [f(1,1)ab] \vee [f(0,1)a'b] \vee [f(0,0)a'b'] \vee [f(1,0)ab'],$$

he took formally $f(a,b) = a/b$, leading to the expansion

$$a/b = ab \vee (0/0)a'b' = a \vee (0/0)b',$$

since $1/1 = 1$, $0/1 = 0$, and $1/0$ is not defined so that $ab'$ has to be $0$, that is, $a \le b$. It
remains to interpret the indefinite "quantity" $0/0$. Here is Boole's description of $0/0$:
"The symbol 0/0 indicates that a perfectly indefinite portion of the class, that is, 'some',
'none', or 'all' of its members are to be taken" (Boole, page 92). Translating Boole, $a \vee xb'$
is a candidate for $a/b$, for $a \le b$ and for any $x$ in $R$, and of course the elements $a \vee xb'$,
as $x$ ranges over $R$, make up precisely the coset $a + Rb'$. As in the case of Copeland, no
attempt was made to define logical operations among these "algebraic fractions".
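
The identification of Boole's indefinite quotient with a coset can be checked by brute force on a small example. The following is only a minimal sketch: it assumes that $R$ is the ring of all subsets of a three-point set, encoded as bitmasks (product is intersection, sum is symmetric difference), and the function names are illustrative rather than taken from any of the works cited.

```python
N = 3                    # assumed size of the underlying set
ONE = (1 << N) - 1       # bitmask of the unit element 1
R = range(ONE + 1)       # every subset, encoded as a bitmask


def complement(b):
    """b' in the Boolean ring."""
    return ONE & ~b


def quotient_solutions(a, b):
    """All x in R with x*b == a, where * is intersection."""
    return {x for x in R if x & b == a}


def coset(a, b):
    """The coset a + R b', where + is symmetric difference."""
    return {a ^ (r & complement(b)) for r in R}


def boole_family(a, b):
    """Boole's expansion a v x b', with x (the 0/0 term) ranging over R."""
    return {a | (x & complement(b)) for x in R}


def leq(a, b):
    """a <= b, that is, ab = a."""
    return a & b == a


for a in R:
    for b in R:
        if leq(a, b):
            # The solutions of x*b = a form exactly the coset a + R b'.
            assert quotient_solutions(a, b) == coset(a, b) == boole_family(a, b)
        else:
            # No quotient exists inside R unless a <= b.
            assert quotient_solutions(a, b) == set()
```

The loop confirms, for this small ring only, both halves of the discussion above: a quotient exists exactly when $a \le b$, and in that case it is determined only up to the coset $a + Rb'$.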

Jevons  (1879)  objected  to  Boole’s  division  on  the  grounds  that  it  lacked  clarity. 
Peirce (1867) retained Boole's operation of division and embellished it. MacFarlane
(1879) produced a very readable and improved version of Boole's idea. Unfortunately,
this  work  did  not  enter  the  main  body  of  logic.  In  effect,  the  vacuum  created  by  the  lack 
of  division  in  Boole-Schroeder  logic  was  filled  by  the  introduction  of  other  operations 
within  logic,  such  as  material  implication. 

Rigorization  of  Boole’s  technique 

Hailperin (1976) thoroughly analyzed Boole's original work, especially his long
forgotten concepts of logical division and fractions of events. In fact, Hailperin came to
the conclusion that Boole's division is viable, provided sufficient rigor is used in
developing the idea. This was accomplished by forming a Chevalley-Uzkov "ring of
quotients" corresponding to "division" by an event $b$ (Uzkov, 1949). Indeed, since all the
elements of a Boolean ring $R$ are zero divisors (except $1$), the standard approach to rings
of quotients is not applicable. A way around this situation was given by Uzkov (1949)
for commutative rings with unity. If $R$ is such a ring, then the construction of a ring of
quotients for $R$ is as follows. Let $S$ be a multiplicatively closed subset of $R$ not
containing $0$. That is, $S \subseteq R$ and $xy \in S$ whenever $x, y \in S$. An equivalence relation $\approx$ is
defined on $R \times S$ by

$$(r, s) \approx (t, u)$$

if and only if there is an element $x \in S$ with

$$x(st - ru) = 0.$$

The set of equivalence classes $r/s$ of $\approx$ is a commutative ring with identity under
the operations given by

$$r/s + t/u = (ru + st)/su$$

and

$$(r/s)(t/u) = rt/su.$$

When $R$ is Boolean, then for any element $b \in R$ with $b \ne 0$, the set $\{b\}$ is a candidate for $S$
above. Thus the corresponding ring of quotients can be formed, and it is easy to see that
the map from it to $Rb$ given by

$$a/b \mapsto ab$$

is an isomorphism, and of course

$$Rb \to R/Rb' : ab \mapsto a + Rb'$$

is an isomorphism as well. Thus Hailperin is led to the association of a conditional event
$(a|b)$ with the coset $a + Rb'$. Hailperin actually took $S$ to be $R \vee b$, the principal filter
associated with $b$, but this leads to the same ring of quotients, and hence both lead to $R/Rb'$.
Calabrese (1987), without having in mind conditional events as "quotients" in a Boolean
ring, proposed an equivalent definition, and hence one equivalent to the coset form. In
any event, for Hailperin, "fractional events" became cosets of principal ideals, which
allowed him to justify Boole's notion of fractional events. He went on to
consider these "fractional events" or "conditional events" as the set of all cosets of
principal ideals. In ring theory, it is not customary to define operations among cosets of
different quotient rings. Because of this fact, Hailperin (p. 112-113) refers simply to the
collection of all conditional events as a partial algebra, that is, the operations $+$ and $\cdot$
can be defined only on each quotient ring, but not between two cosets from two different
quotient rings. This is somewhat surprising since it is precisely this point which is
important for a logic of conditional propositions. It is here that a good motivation for a
new problem in ring theory arises. The problem is this: What are the operations of
interest on the union of all quotient rings of $R$ extending those on each fixed one? This
question is the topic of Chapter 3. Another point is that in his discussion concerning truth
tables, Hailperin (p. 127) did realize that conditional events have three possible truth
values. This fact was realized much earlier by DeFinetti (1964) and Schay (1968), who
defined conditional events precisely this way, that is, from a semantic viewpoint (or
equivalently, by extending the concept of ordinary indicator functions of events or sets).
Not only is this approach to conditional events through a three-valued logic equivalent to
the coset form of conditional events (see Chapter 2), but it sheds light on how to define
logical operations among conditional events, addressing the problem in ring theory
mentioned above. Indeed, it is well-known in classical two-valued logic that if

$$f : \{0,1\}^n \to \{0,1\},$$

then there is a unique logical operation

$$\varphi_f : R^n \to R$$

such that

$$t[\varphi_f(a_1, \ldots, a_n)] = f(t(a_1), \ldots, t(a_n)),$$

where $a_i \in R$, $i = 1, 2, \ldots, n$, and $t$ stands for "truth value of" (see, for example,
Hamilton, 1978). It turns out that this result remains valid in a three-valued logic, as we
show in Chapter 3. Thus, not only will each system of operations on conditionals have a
logical interpretation, but more importantly, the above extension problem in Boolean ring
theory is solvable in view of existing systems of three-valued logics, for example those of
Lukasiewicz, Sobocinski, Kleene, and Bochvar (see Rescher, 1969, and our Chapter 3).
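
The two-valued statement above can be illustrated concretely when $R$ is a ring of subsets of a finite set: the operation $\varphi_f$ is obtained by applying the truth function $f$ pointwise. The sketch below rests on assumptions made only for illustration (a four-point sample space, events as Python frozensets, and the hypothetical helper names `truth` and `lift`). It shows existence of $\varphi_f$ and checks its defining property, but it does not prove uniqueness.

```python
OMEGA = range(4)   # assumed finite sample space


def truth(a, w):
    """t: the truth value (0 or 1) of the event a at the point w."""
    return 1 if w in a else 0


def lift(f):
    """Build phi_f : R^n -> R pointwise from a truth function f : {0,1}^n -> {0,1}."""
    def phi(*events):
        return frozenset(
            w for w in OMEGA if f(*(truth(a, w) for a in events)) == 1
        )
    return phi


# Example: material implication "b implies a", whose truth function is max(1 - x, y).
implies = lambda x, y: max(1 - x, y)
material_implication = lift(implies)

a = frozenset({0, 1})
b = frozenset({1, 2})
result = material_implication(b, a)   # the event b' v a

# Defining property: t[phi_f(b, a)] = f(t(b), t(a)) at every point of OMEGA.
for w in OMEGA:
    assert truth(result, w) == implies(truth(b, w), truth(a, w))
```

The same pointwise recipe applies to any truth function $f$, which is the sense in which each truth table determines an operation on events.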

To complete a survey of Hailperin's work, it should also be mentioned that in
making "Boole's probability rigorous," Hailperin (footnote on p. 191) took the probability
of a conditional event, that is, of a coset, to be a conditional probability. This is indeed
well-defined, and is precisely a "compatibility condition" with probability leading to an
axiomatic theory of conditional events (see Chapter 2). In the same vein, Hailperin
(p. 195-197) proceeded to consider the concept of a "conditional events probability realm."
This is somewhat similar to Rényi's (1970) approach to conditional probability spaces, but
one in which there is a home for $(a|b)$ in $P(a|b)$. See also our Chapter 5. However, since
the space of conditional events was not investigated far enough to reach a reasonable
algebraic structure (mainly due to the lack of operations among conditional events), no
new concepts were introduced beyond that. In Chapter 4, we will show that the space of
conditional events is a Stone algebra, generalizing Boolean algebras. In other words, the
"partial algebra" of Hailperin is in fact an algebra with Lukasiewicz's three-valued logic
interpretation.

In light of the attempt to use the theory of "rings of quotients", the following is
pertinent, and should lay these attempts to rest. The only quotients that make sense in a
Boolean ring are those $a/b$ where $a \le b$. By $a/b$, we mean an element in $R$ for which
$(a/b)b = a$. Indeed, if $(a/b)b = a$, then multiplying through by $b$ gives $ab = a$, whence
$a \le b$. If $a \le b$, then taking $a/b = a$ gives an element whose product with $b$ is $a$.
Further, the ring $R$ cannot be enlarged to another ring so that $a/b$ is defined for $a \not\le b$.
Indeed, $b'(a/b)b = b'a$, and since $b'b = 0$ the left side is $0$; but $b'a = 0$ is not the case
unless $a \le b$. So trying to enlarge $R$
so that division of events is possible in that enlargement is futile. Any divisions by
elements of $R$ that can be carried out in a larger ring can already be carried out in $R$.
More general statements are true. A "ring of quotients" of a ring $R$ is a ring $S$ and a
homomorphism $f : R \to S$ satisfying certain properties. If $R$ is Boolean, then so is the
subring $f(R)$ of $S$. Suppose one wanted to make an element $b$ in $R$ into a unit in $S$, that is,
wanted $S$ to be such that $f(b)$ could be divided into everything in $S$ (or even in $f(R)$). Then
in $f(R)$, $[f(1)/f(b)]f(b) = f(1)$, whence, multiplying through by $f(b)$ as above, we get $f(1) = f(b)$.
But $f(1) = 1$ in $S$, $f$ being a homomorphism, so $f(b) = 1$. So the only way to make an
element of $R$ into a unit is to make it into the identity element. If $a/b$ is to make sense for
every element $a$ (or even just for $a = 1$), that is, if $b$ is to be a unit, then the setting must
be such that $b = 1$. What Hailperin did, in effect, was to go to the ring $R/Rb'$, or
equivalently, $Rb$, where indeed $b$ is the identity; $b + Rb' = 1 + Rb'$, and $b$ certainly is
the identity of $Rb$.

Hailperin used a special Chevalley-Uzkov ring of quotients. There are many "rings
of quotients" in ring theory, two others being the Johnson-Utumi ring of quotients and the
"classical ring of fractions" (see, for example, Lambek, 1966, for background). This can
be seen as follows. We describe these two briefly for commutative rings.

Let $R$ be a commutative ring with unity $1$. An ideal $I$ of $R$ is said to be dense
if, for all $r \in R$, $rI = 0$ implies $r = 0$. A fraction is a (module) homomorphism $h : I \to R$
with domain $I$ being a dense ideal. That is, if $x, y \in I$, then

$$h(x + y) = h(x) + h(y),$$

and if $x \in I$, $r \in R$, then

$$h(rx) = rh(x).$$

Let $\mathrm{Hom}(I, R)$ be the class of fractions with domain $I$, and let $F(R)$ be the union of
the $\mathrm{Hom}(I, R)$ over all dense ideals $I$. For $f \in \mathrm{Hom}(I, R)$ and $g \in \mathrm{Hom}(J, R)$, let $f \sim g$
if $f = g$ on $I \cap J$. Then $\sim$ is an equivalence relation on $F(R)$. It is obviously reflexive
and symmetric. Transitivity is less obvious. For that, first note that the intersection of
dense ideals is dense. Indeed, for $I$ and $J$ dense, $r(I \cap J) = 0$ implies that $rIJ = 0$, so
$rI = 0$, whence $r = 0$. Now let $f \in \mathrm{Hom}(I, R)$, $g \in \mathrm{Hom}(J, R)$, and $h \in \mathrm{Hom}(K, R)$ with
$f \sim g$ and $g \sim h$. Let $x \in I \cap K$. For $y \in I \cap J \cap K$,

$$f(x)y = f(xy) = g(xy) = h(xy) = h(x)y,$$

so

$$(f(x) - h(x))y = 0,$$

and so

$$f(x) = h(x).$$

Thus $\sim$ is also transitive.

With appropriate operations (see, for example, Lambek, 1966), the set $Q(R)$ of
equivalence classes of $\sim$ is a ring, called the Johnson-Utumi complete ring of fractions of $R$.
Since $\mathrm{Hom}(R, R) \to R : f \mapsto f(1)$ is a ring isomorphism, $Q(R)$ contains a copy of $R$. For a
non-zero divisor $r \in R$, $Rr$ is dense, and the $\mathrm{Hom}(Rr, R)$ for such $r$ give rise to a
subring $Q_{cl}(R)$ of $Q(R)$ called the classical ring of quotients of $R$. An element
$f \in \mathrm{Hom}(Rr, R)$ is identified with the "fraction" $x/r$, where $f(r) = x$.

When $R$ is a Boolean ring, the only non-zero divisor is $1$, thus the only dense
principal ideal is $R$ itself, and $Q_{cl}(R) = \mathrm{Hom}(R, R)$ is just $R$ itself. This is the case if
$R$ is finite, for example.

If $R$ is the ring of all subsets of a set $\Omega$, then an ideal $I$ is dense if and only if
$x = \bigvee_{i \in I} i = 1$, since otherwise $x' \ne 0$ and $x'I = 0$. In this case, $Q(R)$ is also just $R$
itself. To see this, for $I$ dense, $f \in \mathrm{Hom}(I, R)$ and $j \in I$,

$$f(j) = f(j) \cdot 1 = f(j) \bigvee_{i \in I} i = \bigvee_{i \in I} f(j)i = \bigvee_{i \in I} j f(i) = j \bigvee_{i \in I} f(i),$$

so $f$ is just multiplication by $\bigvee_{i \in I} f(i) = x$. This $f$ is equivalent to $h \in \mathrm{Hom}(R, R)$ given
by $h(r) = rx$, and $\mathrm{Hom}(R, R)$ is isomorphic to $R$ as indicated above. More generally,
viewing $R$ as a subalgebra of $2^\Omega$ for some set $\Omega$, call $R$ complete if $R$ is closed
under arbitrary unions. Then, as above, $I$ is dense if and only if $\bigvee_{i \in I} i = 1$, and $Q(R) \cong R$.
It turns out that $Q(R) \cong R$ if and only if $R$ is complete (Lambek, 1966). There exist
non-complete Boolean rings, so in general $Q(R)$ properly contains $R$.

1.3  Three-valued  logic 

Independently  of  each  other,  Reichenbach  (1948,  1949),  Schay  (1968),  DeFinetti 
(1972,  1974,  Volumes  1  and  2),  and  Dubois  and  Prade  (1987,  1990)  considered  the 
modeling of conditional events from a logical viewpoint. They all viewed a conditional
event as an object with three possible "truth" values.

First, Reichenbach considered probability as being determined completely through
all standard logical operations over Boolean algebras of propositions and quantified
expressions, as well as through the adjunction of a distinct "probability implication"
operator corresponding to $P(\cdot \mid \cdot)$. He also developed a calculus of probabilities (1949,
Chapter 3), and a related probability logic (1949, Chapter 10). Probabilistic conditioning
and logical implication were compared in two places: (i) in the discussions of basic
axioms for probability (1949, pages 54-57), in which probabilistic conditioning is argued
to be a natural, but because of the zero-probability antecedent case, tacitly modified
extension of logical implication, and (ii) in the use of the probability implication operator
as a "quasi-implication" (1948,
table 4b, page 151; table 5, page 168; and pages 166-168). Reichenbach proposed that, in
place of classical logical implication, quasi-implication relative to three-valued logic
($0$ = false, $1$ = true, and $I$ = indeterminate) was a more suitable operation. Informally, for
$a \in \{0, 1, I\}$,

$$0 \Rightarrow a \quad \text{and} \quad 1 \Rightarrow a$$


are defined semantically as $P(a|0) = I$ and $P(a|1) = a$, respectively. Reichenbach also
briefly considered an equivalent form of the concept of measure-free conditionals through
his "indeterminate" probability implication operator $b \to a$, expressed in symbolic logic as
$(\exists P)(b \to_P a)$. (See Reichenbach, 1948, pages 51, 52, 71, 72.)

Schay (1968) asked "could $(a|b)$ be defined in a manner consistent with general
usage in probability theory, that is, so that $P(ab)/P(b)$ may be interpreted as the
probability of $(a|b)$?" Schay proposed to define $(a|b)$ as a generalized indicator function
on $\Omega$ (here $R$ is a Boolean ring of subsets of $\Omega$), where

$$(a|b)(\omega) = \begin{cases} 1 & \text{if } \omega \in ab, \\ 0 & \text{if } \omega \in a'b, \\ u \text{ (undefined)} & \text{if } \omega \in b'. \end{cases}$$

Note that, for fixed $b$, such functions are clearly in one-to-one correspondence with elements of $Rb$, or
with elements of the quotient ring $R/Rb'$, since such functions specify the subsets $b$ and $ab$,
and conversely.
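
Schay's generalized indicator function is easy to write down for a finite sample space. The following minimal sketch assumes a four-point $\Omega$, events as frozensets, and the string "u" as a stand-in for the undefined value. The helper names are hypothetical. It illustrates that the function determines, and is determined by, the pair of subsets $(ab, b)$.

```python
OMEGA = frozenset(range(4))   # assumed sample space
U = "u"                       # marker for the undefined value


def conditional_indicator(a, b):
    """Return (a|b) as a dict mapping each point w to 1, 0 or U."""
    return {w: (U if w not in b else (1 if w in a else 0)) for w in OMEGA}


def from_indicator(ind):
    """Recover the pair of subsets (ab, b) from a generalized indicator function."""
    b = frozenset(w for w, v in ind.items() if v != U)
    ab = frozenset(w for w, v in ind.items() if v == 1)
    return ab, b


a = frozenset({0, 1})
b = frozenset({1, 2})
ind = conditional_indicator(a, b)

# (a|b) and (ab|b) are the same function, so only ab and b matter.
assert ind == conditional_indicator(a & b, b)
assert from_indicator(ind) == (a & b, b)
```

Here the point 1 lies in $ab$, the point 2 in $a'b$, and the points 0 and 3 in $b'$, so the three values are all realized.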

In discussing conditional probabilities, DeFinetti made a remark about $P(a|b)$,
saying that one can even talk about the probability of the "conditional event" $(a|b)$
(DeFinetti, 1974, page 134). He specifies this mathematical object on page 139 of that
reference as a "tri-event", corresponding precisely to Schay's notion. DeFinetti also
considered interpreting conditional events through a coset representation (1974, Vol. 1,
pages 267-269), but apparently did not connect this with the ideas of Mazurkiewicz (1956)
and others on the same subject. (See Section 1.4 below.) Furthermore, DeFinetti even
considered briefly how one could obtain a "logical sum" of such conditional events (1974,
Vol. 2, page 310) as well as how double conditional events could be interpreted (1974, Vol.
2, pages 327-328). In a related vein, DeFinetti broached the issue of "counterfactuals" and
verifications relative to conditional events, and concluded that compatibility constraints
were key to any further analysis of operations on such entities. But, other than brief
comments on how a calculus of operations among conditional events
could be developed, and on the need for one, no actual work in this direction was carried out.

Bruno and Gilio (1985), inspired by DeFinetti's much earlier work, proposed an
abbreviated algebra of measure-free conditional events to produce "conditional
hyper-probabilities", which, in turn, were used to obtain some new factorization results for
Scozzafava's pseudodensities (1984). However, they did not attach any direct
interpretation to conditional events, such as being cosets of principal ideals, as presented
in our work here. It turns out that their operations are identical to certain of those
proposed by Schay (1968) and Calabrese (1987). (See also Section 1.5.) Based on
DeFinetti's work, but independently of Bruno and Gilio, Barigelli and Scozzafava (1984)
pointed out the apparent lack of attention paid to the domain of conditional probability
operators, that is, to conditional events. Their thesis was that careful consideration of
such events could lead to improved interpretations of frequency data and the elimination of
certain confounding problems.

In discussing reasoning with uncertain information, Dubois and Prade (1987, first
edition 1985) were led to consider a symbol like $(\cdot \mid \cdot)$ for a "non-traditional" logical
connective. (It should not be confused with Sheffer's "binary rejection" $(a|b)$, which is
defined as $a'b'$.) A truth value table for $(a|b)$ is established by observing that the truth
values $t(a|b)$ of $(a|b)$ are solutions of the equation $t(ab) = \min\{t(a|b), t(b)\}$ and so are
given by $t(a|b) = 1$, $0$, or $\{0,1\}$ according as $t(ab) = 1$, $t(a'b) = 1$, or $t(b) = 0$. Here,
$\{0,1\}$ is referred to as an "indeterminate". See also Dubois and Prade (1989, 1990).
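
The truth table just described can be recovered mechanically by solving the equation $t(ab) = \min\{t(a|b), t(b)\}$ for each pair of classical truth values. The sketch below is illustrative only; it assumes the usual two-valued rule $t(ab) = \min\{t(a), t(b)\}$ for conjunction, and the function name is hypothetical.

```python
def conditional_truth_values(t_a, t_b):
    """Solutions x in {0, 1} of t(ab) = min(x, t(b)), given t(a) and t(b)."""
    t_ab = min(t_a, t_b)          # classical truth value of the conjunction ab
    return {x for x in (0, 1) if min(x, t_b) == t_ab}


# Tabulate the solution set for every combination of t(a) and t(b).
table = {
    (t_a, t_b): conditional_truth_values(t_a, t_b)
    for t_a in (0, 1)
    for t_b in (0, 1)
}

for (t_a, t_b), values in sorted(table.items()):
    print(f"t(a)={t_a}, t(b)={t_b}: t(a|b) in {sorted(values)}")
```

When $t(b) = 0$ the equation holds for both candidate values, which is why the "indeterminate" is represented by the whole set $\{0,1\}$.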

Some  additional  efforts  related  to  conditional  events  are  these:  Cox  (1961) 
established  an  algebra  of  conditioning  for  a  fixed  common  antecedent  by  formally 
omitting  the  probability  operator  everywhere  it  appears  in  a  conditional  or  unconditional 
form.  It  appears  that  Cox  implicitly  recognized  the  need  to  establish  measure-free 
conditional  events,  but  did  not  continue  the  development.  The  closest  he  came  to  the 
point  of  introducing  conditional  events  is  in  Chapter  1.3,  corresponding  to  the  measure-free 
chaining  forms.  (See  Chapter  3  in  this  book.) 

Popper  (1961,  Appendices  IV*  and  V*)  developed  a  postulate  system  for  probability 
which  included  conditional  probability  perceived  as  a  numerical  operator  on  ordered  pairs 
of  primitives  subject  to  certain  algebraic  relations  within  the  probability  arguments. 
Furthermore, he defined unconditional events within the probabilities through
probabilistic equivalence, but unfortunately did not attempt to carry out a similar
procedure for the ordered pairs within the probability operator, corresponding to
conditional events.

Mention should also be made of the very general work of Foulis and Randall (1971,
1974) concerning the development of measure-free conditioning maps. Future efforts may
uncover useful connections between that work and ours.

1.4  Coset  form  of  conditional  events 

In  the  attempt  to  define  rigorously  the  concept  of  conditional  events,  compatible 
with  probability  theory,  except  for  Copeland,  all  previous  researchers  came  across,  in  one 
form  or  another,  cosets  of  principal  ideals  in  a  Boolean  ring.  This  section  is  devoted  to  a 
survey  of  more  apparent  work  related  to  the  coset  form  of  conditional  events.  In  historic 
order,  we  survey  the  work  of  Koopman  (1940),  Mazurkiewicz  (1956),  Domotor  (1969), 
Pfanzagl  (1971),  Hailperin  (1986),  and  Calabrese  (1987).  Again,  it  should  be  noted  that 
all  these  efforts  were  carried  out  independently. 

Koopman's program was an investigation into "the axioms and algebra of intuitive
probability". The algebra part, that is, the definition of logical operations among
conditional events, was not addressed! The idea of intuitive or qualitative probability is
well-known: the primal intuition of probability expresses itself in a partial order relation
among events. Qualitative (or comparative) probability is motivated by the desire to make
numerical probability measures compatible with non-numerical probability comparisons.
See, for example, Fine (1973), Fishburn (1983), Villegas (1967), Domotor (1969), and
Suppes (1973). Since information is basically conditional, "conditional events" should be
the basic building blocks rather than unconditional ones. While Rényi (1970) took this
viewpoint from a numerical approach, extending Kolmogorov's model (see Chapter 5),
Koopman first proceeded from a measure-free attack. For $a$ and $b$ in $R$, an expression
denoted by $(a|b)$ is called an "eventuality". In a footnote (1940b, page 270), Koopman
mentioned that the notation $(a|b)$ is used in a manner "close" to that of the coset $a + Rb'$,
without further elaboration. In fact, $(a|b)$ is used as an "alternative notation" for $a + Rb'$.
From a qualitative viewpoint, the basic problem is that of comparison in probability of
eventualities. That is to say, a system of axioms for a partial order relation among
conditionals should first be given. He focused attention on the set $R|R = \bigcup_{b \in R} R/Rb'$ of all
cosets of all principal ideals, and postulated the existence of "n-scales". From this, upper
and lower numerical probabilities of conditional events were shown to exist. There is
some analogy here with the work of Dempster (1967) and of Shafer (1976). Conditional
probabilities themselves were next defined as common values of upper and lower
probabilities (when they happen to be the same). On conditional events with the same
antecedent, this conditional probability reduces to an ordinary probability measure. It
seems at this point that Koopman wanted simply to show that the numerical probability
that he introduced generalized the Kolmogorov model. He made no attempt to introduce
operations between conditional events with different antecedents and to consider the
behavior of his new probability with respect to such operations. Instead, he gave a system
of axioms for orderings among "eventualities," that is, conditionals with different
antecedents. See Section 3.5 for more discussion on Koopman's system of axioms for
intuitive probability.

The major contributions of Mazurkiewicz (1956, Chapter III) were to identify
conditional events as cosets of principal ideals and to note the consistency of the
assignment of probabilities to these cosets. That is, the assignment $P(a + Rb') = P(a|b)$
is well defined, being $P(ab)/P(b)$, so that it gives a
probability measure on the quotient ring $R/Rb'$ of cosets of the principal ideal $Rb'$.
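
The well-definedness claim says that the value $P(ab)/P(b)$ does not depend on which representative of the coset $a + Rb'$ is used. The following sketch checks this by brute force under assumptions chosen only for illustration: $R$ is the ring of subsets of a three-point set, encoded as bitmasks, and the point masses in `WEIGHTS` are arbitrary positive values summing to one.

```python
from fractions import Fraction

N = 3
ONE = (1 << N) - 1
R = range(ONE + 1)                                          # subsets as bitmasks
WEIGHTS = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # assumed point masses


def prob(x):
    """Probability of the event x: the sum of the masses of its points."""
    return sum(w for i, w in enumerate(WEIGHTS) if (x >> i) & 1)


def coset(a, b):
    """The coset a + R b' (sum is symmetric difference, product is intersection)."""
    return {a ^ (r & (ONE & ~b)) for r in R}


def cond_prob(a, b):
    """P(ab) / P(b), assuming P(b) > 0."""
    return prob(a & b) / prob(b)


for b in R:
    if prob(b) == 0:
        continue                      # conditioning on a null event is excluded
    for a in R:
        # Every representative x of a + R b' yields the same value P(xb)/P(b).
        assert {cond_prob(x, b) for x in coset(a, b)} == {cond_prob(a, b)}
```

Only independence from the representative is verified here; additivity on the quotient ring is not checked.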

Domotor (1969) also identified conditionals with cosets of principal ideals, but did
not introduce a partial order on them. He embedded $R|R$ into a larger algebraic structure
equipped with a vector space structure, identifying $R|R$ with DeFinetti's generalized
indicator functions. On this vector space, probability measures are viewed as linear
functionals. However, except for vector space operations, no attempt was made to
consider extensions of ordinary Boolean operators.

In his book on the theory of measurement, Pfanzagl (1971) presented an approach to
the simultaneous measurement of utility and subjective probability generalizing
the approach of Von Neumann and Morgenstern (1947). The
syntactic concept of conditional events is essential in his work. Even though Pfanzagl was aware
of Copeland's implicative algebra (1941) (but not of Koopman's work (1940), in which the
coset form was proposed for conditionals!), he did not adopt Copeland's concept of
(measure-free) conditional events. Instead, he proposed the coset $a + Rb'$ for $(a|b)$.
But he stayed in a fixed (Boolean) quotient ring $R/Rb'$ (that is, for a fixed $b \in R$), so
that conditional events with different antecedents were not investigated. He considered
conditioning in $R/Rb'$, that is, iterated conditional events of the form $((a|b)|(c|b))$, for
$b$ fixed, and this was defined simply as a coset of the Boolean ring $R/Rb'$, that is,

$$((a|b)|(c|b)) = (a|b) + (R/Rb')(c|b)',$$

where

$$(c|b)' = (c'|b),$$

the "negation" of $(c|b)$ in $R/Rb'$. (All operations involved are coset operations on
$R/Rb'$.) With the mathematical concept of (measure-free) conditional events, the notion
of "compound wagers" (conditional bets) can be formulated. For each fixed $b \in R$, the
Boolean quotient ring $R/Rb'$ is the collection of all conditional events (or conditionals)
with the same antecedent $b$. The conditional $(a|b) \in R/Rb'$ can be used in the
representation of "compound wagers" (conditional bets) (Pfanzagl, 1971, p. 205-207) as
follows.

Following Pfanzagl, a "wager" is a situation with a finite number of possible
"outcomes," exactly one of which is to occur. Let $A = \{\alpha_1, \ldots, \alpha_n\}$ be the set of
outcomes for a wager $W$. Which one of the $\alpha_i$'s occurs depends on some uncertain
event $a \in R$. A "simple wager" $w_a$ is defined to be a wager with only two possible
outcomes, say $\alpha$, $\beta$. Specifically, $w_a = \alpha$ if $a$ occurs and $\beta$ if $a'$ occurs. In a logical
framework, $w_a$ can be viewed as a function on the set $\Omega$ of maximal filters of $R$, that
is, models of $R$ (see Chapter 6): for all $\omega \in \Omega$,

$$w_a(\omega) = \begin{cases} \alpha & \text{if } a \in \omega, \\ \beta & \text{if } a' \in \omega. \end{cases}$$

A compound wager is defined to be a wager with

$$A = \{\alpha_1, \alpha_2, \alpha_3, \alpha_4\},$$

depending upon $(a|b)$, i.e., the outcome at $\omega$ is

$$\begin{cases} \alpha_1 & \text{if } ab \in \omega, \\ \alpha_2 & \text{if } a'b \in \omega, \\ \alpha_3 & \text{if } ab' \in \omega, \\ \alpha_4 & \text{if } a'b' \in \omega. \end{cases}$$

For more detail, see Pfanzagl (1971, Chapter 12). See also Neapolitan (1990, p. 57) for
conditional bets.

Independently of previous work on the subject, Calabrese (1987) investigated a
conditioning operator in logic from an empirical viewpoint. His approach is algebraic,
and is based on a relation between logical deducts (or consequences) and filters in a
Boolean ring. (See Tarski, 1956.) For each $b$ in $R$, the set of deducts of $b$ is the filter
$R \vee b = \{r \vee b : r \in R\}$. This is precisely the coset $b + Rb'$. Noting

$$R \vee b = \{x \in R : x \ge b\} = \{x \in R : xb = 1 \cdot b\}$$

and replacing $1$ by any $a$ in $R$ leads to the class of $a$-relative deducts

$$(R \vee b)_a = \{x \in R : xb = ab\}.$$

It is easy to verify that $(R \vee b)_a$ is the coset $a + Rb'$. A further critique of Calabrese's work
will be found at the end of Section 1.5 below.


1.5  Logical  operations  among  conditional  events 

Schay (1968), Bruno and Gilio (1985) and Calabrese (1987) contain developments
of logical operations among conditional events. (Although Adams (1975) did not define
conditional events mathematically, he did propose similar logical operations among them.
See Chapter 0.) A comparison with our operations will be given in Chapter 3.

Schay was apparently the first to attempt a full calculus of operations among
conditionals, especially among those with different antecedents. Specifically, Schay
(1968, page 335) defined five operations as follows. For him the Boolean ring $R$ is
explicitly a ring of subsets of some set $\Omega$, and for $a, b \in R$, $(a|b)$ is a generalized indicator
function, as defined in Section 1.3, that is, $(a|b) : \Omega \to \{0, u, 1\}$ with $(a|b) = 0$ on $a'b$, $u$ on
$b'$, and $1$ on $ab$. Although not referring explicitly to DeFinetti's idea of conditional event
indicator functions, Schay did actually use this idea to help introduce his definitions for
logical operations.

Definition (Schay). For $a$, $b$, $c$, and $d$ in $R$,

$$(a|b)' = (a'|b);$$

$$(a|b) \cap (c|d) = (ac \,|\, bd);$$

$$(a|b) \cup (c|d) = (ab \vee cd \,|\, b \vee d);$$

$$(a|b) \wedge (c|d) = ((b' \vee a)(d' \vee c) \,|\, b \vee d);$$

$$(a|b) \vee (c|d) = (a \vee c \,|\, bd).$$

The operations $(', \cap, \vee)$, as well as $(', \wedge, \cup)$, satisfy DeMorgan's Laws, as is easily
verified. Further, when $d = b$, $\cup = \vee$ and $\cap = \wedge$, and the operations reduce to the usual
set operations on the first component of $(a|b)$, leaving the antecedent fixed. Since
$(a|b) = (ab|b)$, one can restrict to $a \le b$ with no loss of generality. Then $'$ becomes
$(a|b)' = (a'b|b)$.
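
The DeMorgan claims and the reduction for a common antecedent can be verified exhaustively on a small example. The sketch below is not Schay's own formulation: it assumes $R$ is the ring of subsets of a two-point set encoded as bitmasks, stores each conditional in the normalized form $(ab, b)$, and uses the hypothetical names `cap`, `cup`, `wedge` and `vee` for the four binary operations of the definition.

```python
N = 2
ONE = (1 << N) - 1
R = range(ONE + 1)        # subsets of a two-point set, as bitmasks


def cond(a, b):
    """The conditional (a|b), stored in the normalized form (ab, b)."""
    return (a & b, b)


def neg(x):               # (a|b)' = (a'|b)
    a, b = x
    return cond(ONE & ~a, b)


def cap(x, y):            # (a|b) n (c|d) = (ac | bd)
    (a, b), (c, d) = x, y
    return cond(a & c, b & d)


def cup(x, y):            # (a|b) u (c|d) = (ab v cd | b v d)
    (a, b), (c, d) = x, y
    return cond((a & b) | (c & d), b | d)


def wedge(x, y):          # (a|b) ^ (c|d) = ((b' v a)(d' v c) | b v d)
    (a, b), (c, d) = x, y
    return cond(((ONE & ~b) | a) & ((ONE & ~d) | c), b | d)


def vee(x, y):            # (a|b) v (c|d) = (a v c | bd)
    (a, b), (c, d) = x, y
    return cond(a | c, b & d)


CONDITIONALS = {cond(a, b) for a in R for b in R}

for x in CONDITIONALS:
    for y in CONDITIONALS:
        # DeMorgan's Laws for the pairings (', n, v) and (', ^, u).
        assert neg(cap(x, y)) == vee(neg(x), neg(y))
        assert neg(wedge(x, y)) == cup(neg(x), neg(y))
        if x[1] == y[1]:
            # With a common antecedent the two systems coincide.
            assert cup(x, y) == vee(x, y)
            assert cap(x, y) == wedge(x, y)
```

Normalizing to $(ab, b)$ makes equality of conditionals a plain tuple comparison, which is what allows the laws to be tested with `==`.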

Schay did investigate the algebraic structure of the space of conditionals, which turns
out to be equivalent to the set $R|R$ of all cosets of all principal ideals (see Section 2.3).
He noted that it is a lattice with respect to $(\le, \cup, \cap)$ and with respect to $(\le, \vee, \wedge)$. He also
provided an axiomatic description of his structure, analogous to Stone's Representation
Theorem for Boolean algebras. (See especially his Theorem 5.)

Bruno and Gilio (1985), inspired by DeFinetti's work (but independent of Schay)
and motivated by some problems in statistics, also developed a calculus of conditional
events. Here, as with Schay, $(a|b)$ is identified as a generalized indicator function on $\Omega$.
The disjunction and conjunction operations of Bruno and Gilio are identical to Schay's $\cup$
and $\wedge$, respectively, and their negation is the same as Schay's. They define an order
relation among conditionals by $(a|b) \le (c|d)$ if $a \le c$ and $b \le d$.

Independently of Schay, Adams, and Bruno and Gilio, Calabrese used empirical
guidelines to propose an algebra for these objects. His operations (|)', ∨, and ∧ turn out
to be identical to Schay's ', ∪, and ∧, respectively. Calabrese also investigated an
extension of Stone's Representation Theorem, and considered briefly higher order
conditioning. (As stated earlier, Calabrese, unaware of Lewis' Triviality Result, proved that
no mathematical form for conditional events which is compatible with probability can be
given in terms of a Boolean function into R.) A recent discussion of Schay's, Calabrese's,
and our work is in Dubois and Prade (1989, 1990). A more systematic comparison is
given in Section 3.5.

For  ease  of  reference,  we  describe  below  the  essentials  of  Calabrese’s  work 
(Calabrese,  1987). 

Starting  from  the  viewpoint  of  a  unified  algebraic  theory  of  logic  and  probability, 
Calabrese argued for an additional operation (·|·) on a set of propositions represented
by a Boolean algebra R. This same argument had been advocated much earlier by
Copeland (Copeland, 1941); see Section 1.1. However, unlike Copeland, Calabrese came
to realize that a home for conditionals should be outside of R. Although works such as
Adams (1975) and Hailperin (1976) were cited in the references of his paper, apparently
Calabrese  did  not  notice  a  certain  number  of  basic  facts,  namely  Lewis'  Triviality  Result 
(discussed  in  Section  1.8  of  Adams'  book),  Adams'  proposed  logical  operations  for 
conditional  events  (called  conditional  formulas)  (Adams,  p.  46-47),  and  the  (equivalent) 
coset  form  for  conditional  events  in  Hailperin’s  book.  As  such,  he  first  reproved  a  special 
case  of  Lewis'  Triviality  Result,  namely  that  conditional  events  cannot  be  represented  by 
binary  Boolean  operations  on  R  (Calabrese,  1987,  Theorem  2.2.1).  Calabrese’s  approach 
to  defining  conditional  events  was  based  upon  the  concept  of  filters  in  R.  Specifically, 
for a, b ∈ R, the conditional event (a|b) is taken to be the equivalence class of elements
of R with respect to the filter R ∨ b, where by definition a ~ c (under I = R ∨ b) if
and only if ai = ci for some i ∈ I. But it is easy to see that the equivalence class of a
under I is precisely the coset a + I', where I' is the ideal defined by I' = {i' : i ∈ I}.
Indeed, let a[I] denote {x : x ∈ R, x ~ a under I}. Observe that x ∈ a[I] if and only if
x = ri' ∨ ai for some r ∈ R and i ∈ I. But

x = ri' ∨ ai = ri' + ai = ri' + a(1 + i') = (a + r)i' + a,

so that a[I] = a + I', where I' is an ideal. In particular, for I = R ∨ b (the filter
generated by b), we have I' = Rb', and hence (a|b) = a[R ∨ b] = a + Rb', a
principal coset.
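The identification of the filter equivalence class with the principal coset can be confirmed by brute force on a small ring. This is a minimal sketch under the assumption that R is the ring of all subsets of a three-point set, encoded as bitmasks with + as XOR and multiplication as AND; the function names are hypothetical.

```python
N = 3                    # illustrative size of the underlying set (an assumption)
FULL = (1 << N) - 1
R = range(FULL + 1)


def filter_class(a, b):
    """Class of a under the filter R v b: all x with xi = ai for some i >= b."""
    filt = [i for i in R if (i & b) == b]          # elements of the form r v b
    return frozenset(x for x in R if any((x & i) == (a & i) for i in filt))


def coset(a, b):
    """The principal coset a + Rb', with + realized as symmetric difference."""
    not_b = FULL & ~b
    return frozenset(a ^ (r & not_b) for r in R)


def classes_agree():
    """Compare the two descriptions of (a|b) for every pair (a, b)."""
    return all(filter_class(a, b) == coset(a, b) for a in R for b in R)
```

For instance, with a = 0b011 and b = 0b001 both descriptions give the four subsets that contain the point encoded by bit 0.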

Calabrese  went  on  to  consider  logical  operations  among  conditional  events  by 
logical  considerations.  First,  arguing  that  the  statement  "if  p  then  (if  q  then  r)"  is  the 
same as "if (p and q) then r," he identified ((r|q)|p) with (r | p ∧ q). See also
Section 8.1. From that, as an axiom, he defined

((r|q) | (p|s))

to be

(r | q ∧ (p|s)).

Also,  as  an  axiom,  the  disjunction  V  is  defined  by 

(q|p) ∨ (s|r) = ((q ∧ p) ∨ (s ∧ r) | p ∨ r),

which is precisely Adams' "quasi-disjunction" of conditional formulas (Adams, 1975, p.
47).

Of course, since R/Rb' is a Boolean ring, for each fixed b, the negation for (a|b)
should be negation in this Boolean ring, that is to say (a|b)' = (a'|b). The conjunction ∧
is derived from ∨ via DeMorgan:

(q|p) ∧ (s|r) = ((q|p)' ∨ (s|r)')'

= ((q'|p) ∨ (s'|r))' = ((q' ∧ p) ∨ (s' ∧ r) | p ∨ r)'

= ((p' ∨ q) ∧ (r' ∨ s) | p ∨ r)

= ((p → q) ∧ (r → s) | p ∨ r),

where → denotes material implication. Again, this is Adams' "quasi-conjunction"
operation (Adams, 1975, p. 46).

1.6  Notes 

Below we include several intuitive or naive approaches to the problem of combining
conditional events, and of assigning probabilities to them. Although they turn out not to
be satisfactory, either theoretically or practically, they are presented here for the purpose of
completeness because of their apparent widespread use.

Product  space  approach. 

First consider the case of equal antecedents, and consider (a|b) as primitives in our
natural language, as in Adams (1975). If * : R² → R is a Boolean function, then for a
probability measure P on R, assign P̂((a|b)*(c|b)) = P(a*c|b). (There is a problem
already with P̂ being not necessarily well defined.) Now for (aᵢ|bᵢ), where the bᵢ are
not necessarily the same, one can try to reduce to the equal antecedent case by getting a
"common denominator". One possibility for a common denominator is b̂ = (b₁, b₂) in
R². Identify a₁ with â₁ = (a₁, 1), a₂ with â₂ = (1, a₂), and (aᵢ|bᵢ) with (âᵢ|b̂).
Then we are back in the equal antecedent case, but operating in the product space R².
Now for a probability measure P on R, one needs a probability measure P̂ on R² such
that

P̂(âᵢ|b̂) = P(aᵢ|bᵢ), i = 1, 2.

But this requires, for example,

P̂((a₁,1)|(b₁,b₂)) = P̂(a₁b₁,b₂)/P̂(b₁,b₂) = P(a₁b₁)/P(b₁).

Taking P̂ to be the product measure meets this requirement, but the product measure is
unsatisfactory because it implies the independence of the events (a₁,1) and (1,a₂) in
R². Another possibility is to require that P̂(b̂) = 1, and find such P̂ with P̂(a₁,1) =
P(a₁|b₁) and P̂(1,a₂) = P(a₂|b₂). Finding such P̂'s with given marginals is an extremely
difficult task, but has been solved in the case Ω = ℝ, the field of real numbers (Sklar,
1959, 1973). Sklar's Theorem says that if H is an n-dimensional cumulative distribution
function with one-dimensional marginal distributions F₁, F₂, ..., Fₙ, then there exists an
n-dimensional copula C such that for all n-tuples (x₁, x₂, ..., xₙ),

H(x₁, x₂, ..., xₙ) = C(F₁(x₁), F₂(x₂), ..., Fₙ(xₙ)).

Conversely, if F₁, F₂, ..., Fₙ are one-dimensional CDFs and C is an n-dimensional
copula, then H defined above is an n-dimensional CDF with marginals Fᵢ.

Roughly  speaking,  for  n  =  2,  a  copula  is  a  joint  distribution  for  a  pair  of  random 
variables, each of which is uniformly distributed on [0,1]. More formally, a
two-dimensional copula is a mapping

C : [0,1] × [0,1] → [0,1]

such that

(i) for all x ∈ [0,1], C(x,0) = C(0,x) = 0, C(x,1) = C(1,x) = x, and

(ii) for xᵢ and yᵢ ∈ [0,1] with x₁ ≤ x₂ and y₁ ≤ y₂, one has

C(x₁,y₁) − C(x₁,y₂) − C(x₂,y₁) + C(x₂,y₂) ≥ 0.

Note that C is continuous and non-decreasing in each argument, and that

Max{x + y − 1, 0} ≤ C(x,y) ≤ Min{x, y} for x, y ∈ [0,1].


The above bounds are also copulas, termed minimal and maximal copulas,
respectively. For the use of copulas in statistics, see Whitt (1976), Genest and MacKay
(1986a, 1986b), and Marshall and Olkin (1988). For the problem of determining joint
densities from given conditional densities, see Arnold and Press (1989).
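Conditions (i) and (ii) and the two bounds can be spot-checked numerically. The sketch below is only an illustration: it tests the conditions on a finite grid of rationals (a necessary check, not a proof), and the names W, M, Pi, and GRID are hypothetical. W and M are the lower and upper bounds above, and Pi is the product copula corresponding to independence.

```python
from fractions import Fraction
from itertools import product


def W(x, y):
    """Lower bound Max{x + y - 1, 0}."""
    return max(x + y - 1, 0)


def M(x, y):
    """Upper bound Min{x, y}."""
    return min(x, y)


def Pi(x, y):
    """Product copula, the case of independent uniform marginals."""
    return x * y


GRID = [Fraction(k, 10) for k in range(11)]   # exact rationals 0, 1/10, ..., 1


def is_copula_on_grid(C):
    """Check conditions (i) and (ii) at every grid point and grid rectangle."""
    for x in GRID:
        if C(x, 0) != 0 or C(0, x) != 0:
            return False
        if C(x, 1) != x or C(1, x) != x:
            return False
    for x1, x2 in product(GRID, repeat=2):
        if x1 > x2:
            continue
        for y1, y2 in product(GRID, repeat=2):
            if y1 > y2:
                continue
            if C(x1, y1) - C(x1, y2) - C(x2, y1) + C(x2, y2) < 0:
                return False
    return True


def within_bounds(C):
    """Check W <= C <= M on the grid."""
    return all(W(x, y) <= C(x, y) <= M(x, y) for x, y in product(GRID, repeat=2))
```

Exact fractions are used so that the boundary equalities in (i) are not disturbed by floating-point rounding.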


Returning to the problem at hand, it turns out that the above intuitive approach in
the case of equal antecedents leads to a very restrictive form of P̂. Indeed, consider the
case (Ω, R) = (ℝ, ℬ), where ℬ is the Borel σ-field of the reals ℝ. Consider (aᵢ|b),
i = 1, 2, ..., n, where aᵢ = (−∞, sᵢ], and denote by F_P̂ and F_P the CDFs associated with
P̂ and P, respectively. Then for b̂ = b × b × ... × b,

F_P̂(s₁, s₂, ..., sₙ) = P̂(a₁ × a₂ × ... × aₙ | b̂)

= Min{P((−∞, sᵢ] | b) : 1 ≤ i ≤ n} = Min{F_P(sᵢ | b) : 1 ≤ i ≤ n},

where each marginal CDF of F_P̂ is F_P(·|b).

Now, when combining n conditionals as above, by Sklar's theorem, one chooses an
n-dimensional copula C, independently of the forms of the conditionals. The joint CDF
of P̂ is then obtained once P and the bᵢ's are specified. The above form of a maximal
copula in the case of equal antecedents contradicts the independent choice of C.


Combination  of  antecedents  approach 

One way to combine conditional events (a|b) and (c|d) is via ordinary Boolean
operations on both components, for example,

(a|b) ∨ (c|d) = (a ∨ c | b ∨ d),
and

(a|b) ∩ (c|d) = (a ∩ c | b ∩ d).

The problem in doing this is that the first is not well defined (for example when
(a|b) = a + Rb'), and besides violates

P[(a|b) ∨ (c|d)] ≥ P(a|b),

while the second violates

P[(a|b) ∩ (c|d)] ≤ P(a|b).

In fact, in the latter case, for 0 < a = c = bd with P(bd) < P(b) and P(d), one has

P((a|b) ∩ (c|d)) = P((a|b) ∩ (a|d))

= P((ab|bd) ∩ (ad|bd)) = P((bd|bd) ∩ (bd|bd))

= P(bd|bd) = 1,

and yet

0 < P(a|b) = P(d|b) < 1

and

0 < P(c|d) = P(b|d) < 1.
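A concrete instance of this failure is easy to write down. The following sketch assumes a uniform probability on a hypothetical three-point space and takes b = {1, 2}, d = {2, 3}, and a = c = bd; the componentwise meet is the naive operation under discussion, not an operation recommended here.

```python
from fractions import Fraction

OMEGA = frozenset({1, 2, 3})          # hypothetical sample space


def P(event):
    """Uniform probability on OMEGA (an illustrative choice)."""
    return Fraction(len(event), len(OMEGA))


def cond(a, b):
    """Conditional probability P(a|b), assuming P(b) > 0."""
    return P(a & b) / P(b)


def naive_meet(x, y):
    """Componentwise intersection: (a|b) with (c|d) gives (ac | bd)."""
    (a, b), (c, d) = x, y
    return (a & c, b & d)


b = frozenset({1, 2})
d = frozenset({2, 3})
a = c = b & d                          # a = c = bd = {2}

meet = naive_meet((a, b), (c, d))
p_meet = cond(*meet)                   # probability assigned to the naive meet
p_ab = cond(a, b)                      # P(a|b)
p_cd = cond(c, d)                      # P(c|d)
```

Here the naive meet receives probability 1 while each of the two conditionals has probability 1/2, so the "conjunction" is more probable than either conjunct.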

Thus  probabilities  do  not  behave  as  one  would  like  for  either  of  these  operations  on 
conditional  events.  However,  it  will  be  shown  later  that  there  is  a  way  to  combine 
conditional  events  that  extends  the  Boolean  operations  on  ordinary  events  and  so  that 
probabilities  do  behave  properly  with  respect  to  that  combination.  In  effect,  this 
approach  is  a  non-cartesian  product  common  denominator  one. 


Material  implication  approach 

We  end  this  section  with  some  additional  remarks  about  material  implication  and  its 
relation to conditional events. Material implication is the function f : R × R → R defined
by f(a,b) = b' ∨ a, also written b → a. Now material implication satisfies many desirable
properties, including the following, which are trivial to verify.

(1) f(a,b) = f(ab,b) (consequent-antecedent invariance);

(2) f(a,b) = 1 if and only if b ≤ a (tautology);

(3) f(a,b)b = ab (modus ponens);

(4) f(ac,b) = f(a,b)f(c,b), f(a ∨ c,b) = f(a,b) ∨ f(c,b) (homomorphisms);

(5) f(a,bc)f(c,b) = f(ac,b) (chaining);

(6) f(a,1) = a;

(7) f(f(a,b),c) = f(a,bc) (iteration);

(8) f(b',a') = f(a,b) (modus tollens);

(9) f(a,b)f(c,d) = f(ac, a'b ∨ c'd ∨ bd);

(10) f(a,b) ∨ f(c,d) = f(a ∨ c, bd);

(11) f(a,b) + f(c,d) = f((a + c)bd, ab ∨ cd ∨ bd ∨ b'd').

Also note that b → a is the maximum solution to the equation xb = ab. The
function f is not one-to-one. For example, for any s ≤ b' ∨ a, f(s, s ∨ a'b) = f(a,b). The
basic difficulty of material implication is that it is not compatible with probability, that is,
it is not true that P(a|b) = P(b → a) for all P for which P(b) ≠ 0. We have seen this
before, and of course it follows from Lewis' Triviality Result. In fact, as shown in
Section 1.1,

P(b → a) ≥ P(a|b).

Again, see the discussion in Sections 0.1 and 1.1.
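Properties (1) through (11), the non-injectivity example, and the inequality P(b → a) ≥ P(a|b) can all be verified exhaustively on a small ring. The sketch below assumes R is the ring of all subsets of a three-point set encoded as bitmasks, and it uses the uniform measure for the probability check; these choices and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

N = 3
FULL = (1 << N) - 1      # the identity 1 of R
R = range(FULL + 1)


def neg(x):
    return FULL & ~x


def leq(x, y):
    """x <= y in R, that is, xy = x."""
    return (x & y) == x


def f(a, b):
    """Material implication b -> a = b' v a."""
    return neg(b) | a


def properties_hold():
    """Check properties (1)-(11) for all a, b, c, d in R."""
    for a, b, c, d in product(R, repeat=4):
        checks = [
            f(a, b) == f(a & b, b),                                   # (1)
            (f(a, b) == FULL) == leq(b, a),                           # (2)
            (f(a, b) & b) == (a & b),                                 # (3)
            f(a & c, b) == (f(a, b) & f(c, b)),                       # (4)
            f(a | c, b) == (f(a, b) | f(c, b)),                       # (4)
            (f(a, b & c) & f(c, b)) == f(a & c, b),                   # (5)
            f(a, FULL) == a,                                          # (6)
            f(f(a, b), c) == f(a, b & c),                             # (7)
            f(neg(b), neg(a)) == f(a, b),                             # (8)
            (f(a, b) & f(c, d))
            == f(a & c, (neg(a) & b) | (neg(c) & d) | (b & d)),       # (9)
            (f(a, b) | f(c, d)) == f(a | c, b & d),                   # (10)
            (f(a, b) ^ f(c, d))
            == f((a ^ c) & b & d,
                 (a & b) | (c & d) | (b & d) | (neg(b) & neg(d))),    # (11)
        ]
        if not all(checks):
            return False
    return True


def not_one_to_one():
    """For every s <= b' v a, f(s, s v a'b) equals f(a, b)."""
    return all(
        f(s, s | (neg(a) & b)) == f(a, b)
        for a, b, s in product(R, repeat=3)
        if leq(s, f(a, b))
    )


def implication_dominates_conditional():
    """P(b -> a) >= P(a|b) under the uniform measure, whenever P(b) > 0."""
    def P(x):
        return Fraction(bin(x).count("1"), N)
    return all(P(f(a, b)) >= P(a & b) / P(b) for a, b in product(R, repeat=2) if b)
```

Addition in R is symmetric difference, so the `^` operator plays the role of + in property (11).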
CHAPTER 2

DERIVATION  OF  CONDITIONAL  EVENTS 

This  chapter  is  devoted  to  an  axiomatic  approach  to  deriving  the  mathematical 
concept  of  conditional  events.  From  intuitive  properties  capturing  the  basic  aspects  of 
conditioning  and  the  requirement  that  conditioning  be  compatible  with  probability,  we 
proceed  to  derive  conditioning  operators  in  logic,  the  values  of  which  are  conditional 
events.  A  canonical  form  for  conditional  events,  namely  cosets  of  principal  ideals,  is 
obtained.  The  space  of  all  conditional  events  so  obtained  forms  the  basis  of  our  extension 
of  logic  to  the  conditional  case. 

2.1  Generalities 

In  view  of  Stone's  Representation  Theorem  (see,  for  example,  Halmos,  1963)  and  in 
the  spirit  of  symbolic  (and  algebraic)  logic,  the  basic  objects  of  our  analysis  are  the 
elements of a Boolean ring (R, +, ·). A Boolean ring is a ring with identity such that
every element is idempotent, that is, for every element a,

a² = a·a = a.

It follows from

(a + b)² = a + b + ab + ba
= a + b

that ab = −ba. Taking a = b gets a = −a so that a + a = 0, that is, the ring has
characteristic 2, and is also commutative. The identity (or unit, or unity) of R is denoted
1, as usual. Two additional "logical operations" are defined on Boolean rings, ∨ (called
or, or union, or disjunction) and ' (called not, or negation, or complement), by

a ∨ b = a + b + ab,
and

a' = 1 + a,

respectively. Conjunction, or intersection, or and, sometimes denoted by ∧, is taken to be
the multiplication on R. A partial order ≤ is defined by a ≤ b if ab = a. The generic
example of a Boolean ring is the set of all subsets of a set Ω, with + and · given by
symmetric difference (the "exclusive or") and intersection, that is, by

a + b = ab' ∪ a'b
and

a·b = a ∩ b,

where ' is complementation, and ∪ and ∩ are ordinary union and intersection of sets.
The identity 1 is the set Ω and the zero is the empty set ∅. The partial order then is
just ordinary containment of sets. This ring is called the ring of all subsets of Ω, and is
particularly pertinent when Ω is a finite set. More generally, a set of subsets R of an
arbitrary set Ω which is a ring under the operations given by

a + b = ab' ∪ a'b
and

a·b = a ∩ b

is a Boolean ring, and Stone's Representation Theorem says that every Boolean ring is
isomorphic to such a ring of subsets of some set.
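The generic example can be made concrete in a few lines. This is a minimal sketch assuming a hypothetical three-element set Ω; the helper names add, mul, comp, join, and leq stand for +, ·, ', ∨, and ≤ and are not taken from the text.

```python
from itertools import product

OMEGA = frozenset({"x", "y", "z"})    # hypothetical finite set


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [
        frozenset(item for item, keep in zip(items, bits) if keep)
        for bits in product([0, 1], repeat=len(items))
    ]


R = subsets(OMEGA)
ZERO, ONE = frozenset(), OMEGA


def add(a, b):
    """Ring addition: symmetric difference."""
    return a ^ b


def mul(a, b):
    """Ring multiplication: intersection."""
    return a & b


def comp(a):
    """a' = 1 + a."""
    return add(ONE, a)


def join(a, b):
    """a v b = a + b + ab."""
    return add(add(a, b), mul(a, b))


def leq(a, b):
    """a <= b if ab = a."""
    return mul(a, b) == a


def ring_facts_hold():
    """Idempotence, characteristic 2, commutativity, and agreement with set operations."""
    for a, b in product(R, repeat=2):
        if mul(a, a) != a or add(a, a) != ZERO:
            return False
        if mul(a, b) != mul(b, a):
            return False
        if join(a, b) != (a | b) or comp(a) != (OMEGA - a):
            return False
        if leq(a, b) != (a <= b):
            return False
    return True
```

The last three comparisons confirm that ∨, ', and ≤ defined from the ring operations coincide with ordinary union, complement, and containment.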

An  important  concept  is  that  of  an  ideal  of  a  Boolean  ring  R.  More  generally,  an 
ideal of an arbitrary commutative ring R is a nonempty subset I of R such that a − b
is in I for all a and b in I, and a·b is in I for all a in R and b in I. If R is
Boolean, then b = −b, and the condition that a − b is in I becomes simply that a + b
is in I. So an ideal in a Boolean ring R is simply a nonempty subset of R closed under
addition and closed under multiplication by elements of R. An ideal is a principal ideal
if it is of the form


Ra = {ra : r ∈ R}.

Such  ideals  will  be  of  particular  importance  for  us. 

For an ideal I of R, there is associated a ring R/I, called a quotient ring, whose
elements are cosets, that is, subsets of R of the form

a + I = {a + i : i ∈ I},

and addition and multiplication are given by

(a + I) + (b + I) = (a + b) + I
and

(a + I)·(b + I) = a·b + I,

respectively. It is an easy exercise to show that this makes R/I into a ring, using the
properties of an ideal I. Further, R/I is Boolean when R is Boolean. Here, as is the
custom,  we  are  using  +  and  •  for  addition  and  multiplication  in  both  the  rings  R  and 
R/I,  but  the  context  will  make  it  clear  where  we  are  doing  our  adding  and  multiplying. 
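The "easy exercise" can at least be checked by enumeration in a small case. The sketch below assumes R is the ring of all subsets of a three-point set encoded as bitmasks (XOR for +, AND for ·) and uses the principal ideal generated by the hypothetical element g = 0b110; the function names are illustrative.

```python
from itertools import product

N = 3
FULL = (1 << N) - 1
R = range(FULL + 1)


def principal_ideal(g):
    """Rg = {rg : r in R}."""
    return frozenset(r & g for r in R)


def coset(a, ideal):
    """a + I = {a + i : i in I}, with + as symmetric difference."""
    return frozenset(a ^ i for i in ideal)


def quotient(ideal):
    """The set of all cosets of the ideal, that is, the elements of R/I."""
    return {coset(a, ideal) for a in R}


def operations_well_defined(ideal):
    """Sums and products of cosets do not depend on the representatives chosen."""
    for a, b in product(R, repeat=2):
        for a2 in coset(a, ideal):
            for b2 in coset(b, ideal):
                if coset(a2 ^ b2, ideal) != coset(a ^ b, ideal):
                    return False
                if coset(a2 & b2, ideal) != coset(a & b, ideal):
                    return False
    return True


I = principal_ideal(0b110)
Q = quotient(I)
```

With this choice of g the ideal has four elements and the quotient has two cosets, which together partition R.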

A mapping f from a ring R to a ring S is a homomorphism if for a and b in
R,

f(a + b) = f(a) + f(b)

and

f(a·b) = f(a)·f(b).

For an ideal I of a ring R, f(a) = a + I is a homomorphism from the ring R onto
the ring R/I, called the natural homomorphism of R onto R/I. Two rings R and S are
isomorphic if there is a homomorphism from R to S that is one-to-one and onto. If f
is a homomorphism from R to S, then

Ker(f) = {a : f(a) = 0}

is called the kernel of f, and is an ideal of R. If f is from R onto S, then

F(a + Ker(f)) = f(a)

is an isomorphism from R/Ker(f) to S. This is the first isomorphism theorem for rings.
That F is one-to-one from R/Ker(f) onto f(R) is a special case of the following. For any
mapping f defined on a set X, x ~ y given by f(x) = f(y) is an equivalence relation.
Let F(x) denote the equivalence class to which x belongs, and let the set of equivalence
classes be denoted by X/[f]. This set of equivalence classes is a partition of X, the map
F : X → X/[f] is the natural map from X onto X/[f], and η : X/[f] → f(X) given by
η(F(x)) = f(x) is one-to-one and onto.

If R is a ring and f is a one-to-one mapping from R onto a set S, then S can be
made into a ring isomorphic to R. For example, multiply in S by

x·y = f(f⁻¹(x)·f⁻¹(y)).

A probability measure P on a Boolean ring R is a function P from R to the
closed interval [0,1] such that P(1) = 1 and P(a ∨ b) = P(a) + P(b) whenever ab = 0.
This last property is the finite additivity of P. There is a stronger property sometimes
required, called σ-additivity, but we will not need it. An atom in a Boolean ring is an
element a such that a ≠ 0 and if b ≤ a, then b = a or b = 0. The ring is atomic if
every nonzero element contains an atom. Finite Boolean rings are always atomic, and the
ring of Borel sets of Euclidean space is atomic. If a is an atom in a Boolean ring, then P
defined by P(b) = 1 if a ≤ b and P(b) = 0 otherwise, is a probability measure on R. If
a is not an atom, then P is not a probability measure, since for 0 ≠ c < a, P(ac') = 0 =
P(ac) while P(a) = 1, hence additivity fails. (See, however, Section 2.2.)
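The dichotomy between atoms and non-atoms can be observed directly in a finite ring. This sketch assumes R is the ring of all subsets of a three-point set encoded as bitmasks; indicator_measure is a hypothetical name for the function P of the preceding paragraph.

```python
N = 3
FULL = (1 << N) - 1      # the identity 1 of R
R = range(FULL + 1)


def leq(x, y):
    """x <= y in R, that is, xy = x."""
    return (x & y) == x


def indicator_measure(a):
    """P(b) = 1 if a <= b and 0 otherwise."""
    return lambda b: 1 if leq(a, b) else 0


def is_probability(P):
    """P(1) = 1 and finite additivity on disjoint pairs."""
    if P(FULL) != 1:
        return False
    return all(
        P(x | y) == P(x) + P(y)
        for x in R
        for y in R
        if (x & y) == 0
    )


def is_atom(a):
    """a is nonzero and the only elements below a are 0 and a."""
    return a != 0 and all(b in (0, a) for b in R if leq(b, a))
```

In this small ring the function built from a is a probability measure exactly when a is an atom, which matches the statement above.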

2.2  Conditioning  operators 

As stated in Chapters 0 and 1, the main goal is to define objects of the form "a
given b", denoted (a|b), for a and b elements of a Boolean ring R. Although the
operation (·|·) on R × R is termed measure-free conditioning, the derivation implicitly
involved probability measures. We wish to define (a|b) in such a way that for any
probability measure P on R, it is possible to assign the conditional probability P(a|b)
to (a|b) without ambiguity. In this spirit, the theory of measure-free conditioning
developed here is compatible with probability theory. If this compatibility condition is
relaxed, then the door is open to other types of conditional objects. Lewis' Triviality
Result is established precisely within this probability compatibility condition. The
probability compatibility requirement is appealing, for example, in expert systems since the
strength of the production rule "if b then a" is usually quantified by the conditional
probability P(a|b). A typical example is the Markov random field model of Lauritzen
and Spiegelhalter (1988). But measure-free conditional events compatible with probability
can be used to investigate other non-probabilistic conditioning as well. The recent work
of Dubois and Prade (1991) is relevant.

In  view  of  previous  work  on  measure-free  conditionals,  it  seems  that  the  coset  form 
is  a  reasonable  one.  In  the  following,  we  will  arrive  at  this  form  from  an  axiomatic 
approach. 

We are going to search for maps f from R × R onto some space S which capture the
basic aspects of conditioning compatible with conditional probability evaluations. A value
f(a,b) of f will be called a (measure-free) conditional event. Our strategy is this. The
mapping f will be required to satisfy a set of axioms, or requirements. Since S is
unknown, we will work on the domain R × R of f, examining the partition on it induced
by the equivalence relation (a,b) ≈ (c,d) if f(a,b) = f(c,d). A version of f is then
obtained by assigning to each (a,b) its equivalence class F(a,b). Note that

(R × R)/[f] → f(R × R) : F(a,b) → f(a,b)

is a one-to-one correspondence. Thus if all f satisfying the axioms induce the same
partition of R × R, then


F : R × R → (R × R)/[f]

is a canonical form for f. This program will be carried out by examining what each f(·,b)
must be like, for each b in R.

Any  "conditioning"  operator  /  should  at  least  possess  the  following  three  properties. 

(1) f(R,1) is a copy of R, that is, f(·,1) is one-to-one. This satisfies the
requirement that conditional events should be generalizations of ordinary events.

(2) For (a,b) in R × R, f(a,b) = f(ab,b). This says that conditioning a on b is
the same as conditioning ab on b.

(3) For any probability measure P on R, Q(f(a,b)) = P(a|b) defines an extension
of P to W_P = {f(a,b) : P(b) > 0}.

That Q is well defined requires that for f(a,b) and f(c,d) in W_P with
f(a,b) = f(c,d), one should have P(a|b) = P(c|d). For Q to be an extension of P also
requires that R be contained in W_P. A weaker form of (3) is this.

(3') If f(a,b) = f(c,b), then P(ab) = P(cb) for all P for which P(b) > 0.

By Lemma 1 below, (3') is equivalent to

(3'') If f(a,b) = f(c,b), then ab = cb.

It is reasonable to postulate that conditional events with different antecedents are
different, namely

(4) If f(a,b) = f(c,d), then b = d.

A weaker form of (4) is

(4') If f(0,b) = f(c,d), then b = d, and if f(b,b) = f(c,d), then b = d.

In conjunction with (3) or (3'), (4') becomes

(4'') If f(0,b) = f(c,d), then b = d and cd = 0, and if f(b,b) = f(c,d), then b = d
and cd = d.

We  need  some  technical  lemmas.  To  construct  Dirac-like  probability  measures  on 
arbitrary  Boolean  rings,  we  spell  out  the  following  procedure. 

Lemma 0. Let R be a Boolean ring. Let a₁, a₂, ..., aₙ be non-zero mutually disjoint
elements of R, and let α₁, α₂, ..., αₙ be in [0,1] with Σ αᵢ = 1. Then there is a
probability measure P on R such that if aᵢ ≤ b for i ∈ J, and (∨_{i∉J} aᵢ)b = 0, then
P(b) = Σ_{i∈J} αᵢ.

Proof. By Stone's Representation Theorem, we may identify R with a subalgebra
of the algebra of all subsets of some set Ω. Thus aᵢ is a subset of Ω. Since each aᵢ is
nonempty, pick ωᵢ ∈ aᵢ, and let P = Σᵢ αᵢ δ_{ωᵢ}, where δ_{ωᵢ} is the Dirac probability with
mass one at ωᵢ, given by δ_{ωᵢ}(b) = 1 if ωᵢ ∈ b and 0 otherwise. It is easy to check that P is
a probability measure on R with the desired property.
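The construction in this proof is a finite mixture of point masses, which can be sketched as follows. The sets a1, a2, a3, the chosen points, and the weights are hypothetical data, and events are represented as ordinary Python sets rather than elements of an abstract ring.

```python
from fractions import Fraction


def dirac_mixture(points, weights):
    """Return P = sum_i alpha_i * delta_{omega_i} for events given as sets."""
    if sum(weights) != 1:
        raise ValueError("weights must sum to 1")

    def P(event):
        # Add the weight alpha_i of every chosen point omega_i lying in the event.
        return sum((w for p, w in zip(points, weights) if p in event), Fraction(0))

    return P


# Hypothetical disjoint nonempty sets and one chosen point from each.
a1, a2, a3 = {1, 2}, {3}, {4, 5}
P = dirac_mixture(
    points=[1, 3, 4],
    weights=[Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)],
)

b = {1, 2, 3, 6}          # contains a1 and a2, and is disjoint from a3
p_b = P(b)                # should equal alpha_1 + alpha_2
```

Because b contains a1 and a2 and misses a3, its probability is the sum of the first two weights, as the lemma asserts.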

Lemma 1. Let R be a Boolean ring. If P(ab) = P(cb) for all probability measures P
on R such that P(b) > 0, then ab = cb.

Proof. Suppose that ab ≠ cb. Then either (ab)(cb)' ≠ 0 or (ab)'(cb) ≠ 0.
Suppose (ab)(cb)' ≠ 0. In view of Lemma 0, let ω ∈ (ab)(cb)' ≠ 0, ω ∈ Ω. Let P be
the Dirac probability measure on the set of subsets of Ω with mass 1 at ω. Then
P(ab) = 1, P(b) > 0, and P(ab) ≠ P(cb) = 0. □

Lemma 2. Let R be a Boolean ring, and let a, b, c, d be elements of R with b ≠ 0 ≠ d.
The following are equivalent.

(i) P(a|b) ≤ P(c|d) for all probability measures P on R for which P(b) ≠ 0 ≠ P(d).

(ii) Either ab = 0, or d ≤ c, or ab ≤ cd and c'd ≤ a'b.

Proof. Assume (ii). If ab = 0 or d ≤ c, then obviously (i) holds. Suppose that
ab ≤ cd and c'd ≤ a'b. Using those two inequalities and the fact that for t ≥ 0 and
P(y) ≥ P(x),


P(x)/P(y) ≤ [P(x) + t]/[P(y) + t],

we get

P(a|b) = P(ab)/P(b)

≤ [P(ab) + P(cd) − P(abcd)]/[P(b) + P(cd) − P(abcd)]

≤ P(cd)/P(b ∨ cd) = P(cd)/P(ab ∨ a'b ∨ cd)

= P(cd)/P(a'b ∨ cd) ≤ P(cd)/P(c'd ∨ cd)

= P(cd)/P(d) = P(c|d).

Now assume (i), and that neither ab = 0 nor d ≤ c. Then ab ≠ 0 and c'd ≠ 0.
First, we get ab ≤ cd. If not, then (ab)(cd)' ≠ 0. If (ab)(cd)'d ≠ 0, view R as a
subalgebra of the algebra of subsets of a set Ω, and let ω ∈ Ω with ω ∈ (ab)(cd)'d. The
restriction of the Dirac probability measure P on the set of all subsets of Ω given by
P(ω) = 1 has the property that

P(b) = P(d) = P(ab) = 1,

and P(cd) = 0. Thus P(a|b) = 1 while P(c|d) = 0.

If (ab)(cd)'d = 0, let γ and ω be elements of Ω such that γ ∈ (ab)(cd)' and
ω ∈ c'd. Note that γ ≠ ω. Giving γ and ω each mass 1/2 yields a probability measure
on the set of all subsets of Ω whose restriction P to R has the property that P(a|b) =
P(ab)/P(b) ≥ 1/2, while P(c|d) = 0.

The proof that c'd ≤ a'b is similar. □
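The equivalence in Lemma 2 can be probed numerically on a small ring. The sketch below is only an illustration under stated assumptions: R is the ring of all subsets of a three-point set, and condition (i) is tested against a finite family of measures, namely the point masses and the equal-weight two-point mixtures that appear in the proof, rather than against all probability measures. The names FAMILY, prob, condition_i, and condition_ii are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

N = 3
FULL = (1 << N) - 1
R = range(FULL + 1)

# Supports of the test measures: single points and pairs of points.
FAMILY = [(i,) for i in range(N)] + list(combinations(range(N), 2))


def neg(x):
    return FULL & ~x


def leq(x, y):
    return (x & y) == x


def prob(support, event):
    """Equal-weight measure on the points of `support`, evaluated at `event`."""
    hits = sum(1 for p in support if (event >> p) & 1)
    return Fraction(hits, len(support))


def condition_i(a, b, c, d):
    """P(a|b) <= P(c|d) for every test measure with P(b) != 0 != P(d)."""
    for s in FAMILY:
        pb, pd = prob(s, b), prob(s, d)
        if pb != 0 and pd != 0:
            if prob(s, a & b) / pb > prob(s, c & d) / pd:
                return False
    return True


def condition_ii(a, b, c, d):
    """ab = 0, or d <= c, or (ab <= cd and c'd <= a'b)."""
    return (
        (a & b) == 0
        or leq(d, c)
        or (leq(a & b, c & d) and leq(neg(c) & d, neg(a) & b))
    )


def lemma_2_agrees():
    """Compare (i), restricted to FAMILY, with (ii) for all a, b, c, d with b, d nonzero."""
    return all(
        condition_i(a, b, c, d) == condition_ii(a, b, c, d)
        for a, b, c, d in product(R, repeat=4)
        if b != 0 and d != 0
    )
```

The restricted family suffices here because every counterexample used in the proof is either a point mass or an equal-weight mixture of two point masses.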

The  following  corollary  is  immediate. 

Corollary 1. Let R be a Boolean ring and a, b, c, d be in R with b ≠ 0 ≠ d. The
following are equivalent.

(i) P(a|b) = P(c|d) for all probability measures P on R for which P(b) ≠ 0 ≠ P(d).

(ii) Either ab = cd = 0, or b ≤ a and d ≤ c, or ab = cd and b = d.

In view of Lemma 1, we see that if f satisfies (3), then it satisfies (1). Indeed, if
f(a,1) = f(c,1), then P(a) = P(c) for all probability measures P on R, and hence a = c.
Thus f(·,1) is one-to-one on R, and R is identified with f(R,1). Also it follows from
Lemma 1 that if f(a,b) = f(c,b), then ab = cb. Thus (2) and (3) are the basic
requirements for conditioning operators.

Theorem 1. If f satisfies (2) and (3), then for each b, R/[f(·,b)] = R/Rb'.

Proof. It suffices to show that Rb' and the kernel of f(·,b) define the same
equivalence relation on R. Let a and c be in R. Then f(a,b) = f(c,b) if and only if
f(ab,b) = f(cb,b) (by (2)) if and only if ab = cb (by Lemma 1) if and only if a + Rb' =
c + Rb'. □

Some remarks are in order. First, since f(R,b) and R/Rb' are in one-to-one
correspondence, in fact by a + Rb' → f(a,b), and R/Rb' is a ring, f(R,b) becomes a ring
isomorphic to R/Rb'. Second, note that the mapping

R × R → R/Rb' : (a,b) → a + Rb'

does satisfy (2) and (3). See Section 2.3 for more details.
It remains to describe all f satisfying (2) and (3). Theorem 1 gives a description
for such f locally, that is, of each f(·,b). Each f(·,b) induces the same partition on R,
namely into cosets of Rb'. However, two such f do not necessarily induce the same
partition on R × R. In fact, let f be defined on R × R by f(a,b) = a + Rb', and define
g by g = f except that g(a,b) = R whenever a ∧ b = 0. This does define a function: if
f(a,b) = f(c,d) and a ∧ b = 0, then c ∧ d = 0. Now f(·,b) and g(·,b) both induce the
partition of R into cosets of Rb', but do not induce the same partition of R × R, as is
obvious. Furthermore, f and g satisfy (2) and (3). The problem is that g(a,b) = g(c,d)
does not imply that b = d. If this latter condition is satisfied, that is, if we assume
property (4) in addition to (2) and (3), then any two such f's are equivalent in the sense
that they determine the same partition of R × R. Condition (4) makes the f(R,b)'s be
mutually disjoint, and (2) and (3) make each be in one-to-one correspondence with
R/Rb'. So any f satisfying (2), (3), and (4) is equivalent to the map defined by
(a,b) → a + Rb'. Thus we have the following theorem.

Theorem 2. Let f satisfy (2), (3), and (4). Then f is equivalent to the map g defined by
g(a,b) = a + Rb'. □

If  (2)  and  (3)  are  assumed,  then  something  a  little  weaker  than  (4)  will  suffice. 

Theorem 3. The conditions (2), (3), and (4') imply (4). That is, (2), (3), and (4') are
equivalent to (2), (3), and (4).

Proof. Assume (2), (3), and (4'). Suppose that f(a,b) = f(c,d). Then there is a
probability measure P on R such that P(b) ≠ 0 ≠ P(d). By Lemma 2, either (i) b = d,
or (ii) b ≤ a and d ≤ c, or (iii) ab = cd = 0. In case (ii),

f(a,b) = f(ab,b) = f(b,b) = f(c,d),

and (4') implies that b = d. In case (iii),

f(a,b) = f(0,b) = f(c,d),

and (4') implies that b = d. Thus b = d in any case, and the theorem follows. □

2.3  Conditional  events 

The  analysis  of  Section  2.2  has  led  us  to  a  canonical  form  for  conditional  events. 
This  form  will  be  used  throughout  this  book. 



Definition. Let R be a Boolean ring. For a and b in R, the (measure-free)
conditional event "a given b", written (a|b), is the coset a + Rb'. The space
∪_{b∈R} R/Rb' of all conditional events is denoted by R|R. It is sometimes referred to as
the conditional extension of logic.

As we will see, the union ∪_{b∈R} R/Rb' above is a disjoint one. That is,
(R/Rb') ∩ (R/Rd') = ∅ for b ≠ d. The function f(a,b) = a + Rb' satisfies all the
requisite properties discussed in the last section, including the property that f(a,b) = f(c,d)
implies that P(a|b) = P(c|d). There are many "conditioning operators" which are not
"probability compatible". Examples are

f(a,b) = ab,
and

f(a,b) = (b → a) = b' ∨ a.

More generally, take

f(a,b) = ab ∨ db'

for any d in R. Then for d = 0, we get f(a,b) = ab, and for d = 1, we get f(a,b) =
b' ∨ a. These cannot be compatible with probability by Lewis' Triviality Result in
Section 1.1.
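The definition can be explored on a small example. The following is a minimal sketch, assuming R is the ring of all subsets of a three-point set, encoded as bitmasks with XOR as addition and AND as multiplication. The names N, FULL, R, and coset are illustrative and do not come from the text.

```python
N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1      # the unit 1 of R
R = range(FULL + 1)      # elements of R encoded as bitmasks


def coset(a, b):
    """Conditional event (a|b) = a + Rb' as a frozenset of elements of R."""
    not_b = FULL & ~b    # b' = 1 + b
    return frozenset(a ^ (r & not_b) for r in R)


# R/Rb' for each b, and their union R|R
quotients = {b: {coset(a, b) for a in R} for b in R}
events = set().union(*quotients.values())

# The union is disjoint exactly when no coset is counted twice.
disjoint = sum(len(q) for q in quotients.values()) == len(events)
print(len(events), disjoint)
```

In this encoding each conditional event is determined by the pair (ab, b) with ab ≤ b, so a ring of subsets of an N-point set has 3^N conditional events.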

We will now look at some of the properties of (a|b). The function f(a,b) =
a + Rb' on R × R will be denoted by (·|·). Thus (·|·) is a function from R × R
onto ∪_{b∈R} R/Rb' = R|R.

(1) The function (·|b) is a homomorphism from the ring R onto the quotient ring
R/Rb'. This quotient ring consists of all cosets of the form a + Rb', or (a|b), b fixed.
These cosets partition R, that is, two cosets (a|b) and (c|b) are equal or disjoint, and
every element of R is in some (a|b). In fact, a is in (a|b). Thus to check that two
cosets (a|b) and (c|b) are equal, it is enough to find one element in common. Note that
(0|0) = 0 + R = R is a coset and leads to the trivial quotient ring R/R, a ring with only
one element.

(2) (·|1) is one-to-one on R, and in fact is an isomorphism from R to
(R|1) = R/R0, which is identified with R itself, cosets of R0 = {0} being of the form
a + {0} = {a}.

(3) Since

a + Rb' = a + ab' + Rb'

= a(1 + b') + Rb'

= ab + Rb',

we get that (a|b) = (ab|b). This is just property (2) in Section 2.2.

(4) In R, a closed interval [a,b] consists of all c such that a ≤ c ≤ b, and recall
that x ≤ y if xy = x. When we write [a,b], we mean implicitly that a ≤ b. Cosets of
principal ideals in Boolean rings and closed intervals are the same thing. In fact, (a|b) =
[ab, b → a], or [ab, a ∨ b'], and any closed interval

[a,b] = (a|b' ∨ a) = (ab|b' ∨ a).

To see this, for a + rb' in a + Rb',

ab(a + rb') = ab,
and

(a + rb')(b' ∨ a) = a + rb'(b' ∨ a) = a + rb'.

Thus (a|b) ⊆ [ab, b' ∨ a]. For ab ≤ c ≤ b' ∨ a, we have cb = ab, so

c = ab ∨ (c(ab)') = ab + cb',

which is in ab + Rb' = a + Rb'. Thus (a|b) = [ab, b' ∨ a]. For an interval [a,b] =
[ab, b],

(ab|b' ∨ a) = [ab ∧ (b' ∨ a), (b' ∨ a)' ∨ (ab(b' ∨ a))]

= [ab, (ba') ∨ ((ab) ∧ (b' ∨ a))] = [ab, (b ∧ a') ∨ (ba)] = [ab, b].
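The identification of cosets with closed intervals can be checked by brute force. This is a sketch under the assumption that R is the ring of subsets of a three-point set encoded as bitmasks; the helper names coset, interval, and comp are hypothetical.

```python
N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def comp(x):
    """Complement x' = 1 + x."""
    return FULL & ~x


def coset(a, b):
    """(a|b) = a + Rb'."""
    return frozenset(a ^ (r & comp(b)) for r in R)


def interval(lo, hi):
    """Closed interval [lo, hi], where x <= y means xy = x."""
    return frozenset(c for c in R if (lo & c) == lo and (c & hi) == c)


# (a|b) = [ab, a v b'] for every a and b
coset_is_interval = all(
    coset(a, b) == interval(a & b, a | comp(b)) for a in R for b in R
)

# [a, b] = (a | b' v a) whenever a <= b
interval_is_coset = all(
    interval(a, b) == coset(a, comp(b) | a)
    for a in R for b in R if (a & b) == a
)
```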

This fact that cosets of principal ideals and closed intervals are the same things
gives nothing new except the, perhaps important, realization of conditional events as
intervals [a,b]. Such an interval has a ready interpretation, in fact, a ready meaning: the
set of all elements of R between a and b. (Remember, a ≤ b.) The interval [a,b] is
the conditional event "a given (b' ∨ a)", and the interval [ab, b' ∨ a] is the conditional
event "a given b", or (a|b). Thinking of a conditional event as an interval has perhaps
more intuitive appeal than thinking of it as a coset a + Rb'. In any case, it is convenient
sometimes to visualize a + Rb' as all sets between ab and a ∨ b'.

The following property of cosets is fundamental enough to be called a
theorem.


Theorem 1. The two cosets a + Rb' and c + Rd' of R are equal if and only if ab =
cd and b = d.

Proof. If ab = cd and b = d, then clearly the two cosets are equal. Now suppose
that

a + Rb' = c + Rd'.

Then, for some r in R,

ab = cd + rb',
so

abb = cdb + rb'b, that is, ab = cdb.

Thus ab ≤ cd. By symmetry, cd ≤ ab, whence ab = cd. Now, for some s in R,

ab + b' = cd + sd' = ab + sd',

so b' = sd', whence b' ≤ d'. By symmetry, d' ≤ b', so b = d. □


This theorem exhibits all the relevant properties of our conditioning operator (·|·).
It asserts that (a|b) = (c|d) if and only if ab = cd and b = d. In particular, if (a|b)
= (c|d), then


P(a|b) = P(ab)/P(d) = P(c|d).

If a function f is equivalent to (·|·) in the sense of Theorem 2 of Section 2.2,
then f has the property that f(a,b) = f(c,d) if and only if ab = cd and b = d. Further,
any function f having this property is equivalent to (·|·).
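The equality criterion of Theorem 1, and its consequence for conditional probabilities, can be tested exhaustively on a small ring. The sketch below assumes R is the ring of subsets of a three-point set and uses the uniform measure as an example of P. Both choices, and the names coset and prob, are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def coset(a, b):
    """(a|b) = a + Rb'."""
    return frozenset(a ^ (r & FULL & ~b) for r in R)


def prob(a, b):
    """P(a|b) = P(ab)/P(b) under the uniform measure; None when P(b) = 0."""
    if b == 0:
        return None
    return Fraction(bin(a & b).count("1"), bin(b).count("1"))


# (a|b) = (c|d) if and only if ab = cd and b = d
criterion_ok = all(
    (coset(a, b) == coset(c, d)) == ((a & b) == (c & d) and b == d)
    for a, b, c, d in product(R, repeat=4)
)

# Equal conditional events receive equal conditional probabilities.
prob_ok = all(
    prob(a, b) == prob(c, d)
    for a, b, c, d in product(R, repeat=4)
    if coset(a, b) == coset(c, d)
)
```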

There are two forms for conditional events that are equivalent to ours that are worth
considering. First is the form proposed by Schay (1968) and DeFinetti (1972). Let R be
a ring of subsets of some set Ω, and define g on R × R by g(a,b)(ω) = 1 if ω is in
ab, 0 if ω is in a'b, and u for ω in b'. That is, g(a,b) is a function from Ω to
{0,1,u}, where the "u" stands for "undefined". Clearly g(a,b) = g(c,d) if and only if
ab = cd and b = d, so g is equivalent to (·|·).
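The three-valued form is easy to tabulate. In this sketch Ω is assumed to be a three-point set {0, 1, 2}, events are bitmasks, and g returns the tuple of values g(a,b)(ω); the encoding and the function name are illustrative choices.

```python
from itertools import product

N = 3                    # assumed number of points in Omega
FULL = (1 << N) - 1
R = range(FULL + 1)


def g(a, b):
    """Three-valued indicator of (a|b): 1 on ab, 0 on a'b, 'u' on b'."""
    values = []
    for w in range(N):
        bit = 1 << w
        if not (b & bit):
            values.append("u")
        elif a & bit:
            values.append(1)
        else:
            values.append(0)
    return tuple(values)


# g(a,b) = g(c,d) if and only if ab = cd and b = d
same_classes = all(
    (g(a, b) == g(c, d)) == ((a & b) == (c & d) and b == d)
    for a, b, c, d in product(R, repeat=4)
)
```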

Another form for conditional events that is equivalent to the coset one adopted here
is given by f(a,b) = (ab, b). Clearly, f(a,b) = f(c,d) if and only if ab = cd and b = d.
Thus conditional events are pairs (a,b) with a ≤ b. This form has the appealing
interpretation that conditional events are events, (a,b) being viewed as the event a in the
subring Rb of R, and this being viewed as different from the event a in the ring R.
That is, it is the pair (a,b). The space in which it is an event must be kept track of.
Another advantage of this realization of conditional events is that pairs are simpler to
visualize and to manipulate than cosets.

We remark that (a|b) can be realized as the set of all solutions x of the equation
xb = ab, which is, of course, all those elements between ab and a ∨ b'. Using other
binary operations than φ(a,b) = ab, and considering the set of all solutions of the
equation φ(x,b) = φ(a,b) gives other formulations of conditional events, and a way to
extend the concept to algebraic structures more general than Boolean rings. (See Chapter
8.)

Conditional events (a|b), that is, cosets a + Rb', can be expressed in terms of
filters of R. A filter in the Boolean algebra R is a non-empty subset F of R such that
if a and b are in F, then ab ∈ F, and if a ∈ F and a ≤ b, then b ∈ F. The relation
between ideals and filters is that F is a filter in R if and only if F' = {1 + x : x ∈ F}
is an ideal in R. Given a filter F, an equivalence relation is defined by a ≡ b if there is
an element f ∈ F with af = bf. Letting [a] denote the equivalence class containing a,
the relation with cosets is expressed in the equation

[a] = a + F'.

For principal ideals, the situation is particularly simple. For b ∈ R, the set
R ∨ b = {r ∨ b : r ∈ R} is a filter. It just consists of all elements x such that b ≤ x.
Further, (R ∨ b)' = Rb', and so in this case,

(a|b) = [a] = a + Rb'.

Now that we have conditional events (a|b) identified as cosets a + Rb' of R, we
must establish logical operations between them, and this will be carried out in Chapter 3,
where the ring operations of R will be extended to its cosets. However, conditional
events,  as  subsets  of  R,  can  be  combined  via  union  and  intersection  as  well  as  other 
ordinary  set  operations.  These,  of  course,  are  not  extensions  of  the  ring  operations  of  R, 
but  may  be  of  some  interest  in  their  own  right.  The  ordinary  set  operations  on  conditional 
events  with  the  same  antecedent  are  of  course  well  understood,  since  the  cosets  of  an  ideal 
Rb'  partition  R.  For  example,  the  intersection  of  two  such  cosets  is  either  empty  or  the 
cosets  are  identical.  For  cosets  with  different  antecedents,  the  situation  is  a  bit  more 
complex. 


Theorem 2. The following hold.

(1) (a|b) ∩ (c|d) is the coset ((ab ∨ cd)|(b ∨ d)) if abd = cbd, and is empty
otherwise.

(2) (a|b) ⊆ (c|d) if and only if cd ≤ ab and a ∨ b' ≤ c ∨ d'.

(3) (a|b) ∪ (c|d) is a coset if and only if one is contained in the other, or

ab ≤ cd ≤ a ∨ b' ≤ c ∨ d',
or

cd ≤ ab ≤ c ∨ d' ≤ a ∨ b'.

In the last case, for example,

(a|b) ∪ (c|d) = (cd|(cd ∨ a'b)).

Proof. (1) If a + rb' = c + sd', then multiplying through by bd gets abd = cbd.
This latter equality implies easily that

ab ∨ cd ≤ (a ∨ b') ∧ (c ∨ d').

The coset (a|b) is the interval [ab, a ∨ b'] and (c|d) = [cd, c ∨ d']. It follows that

(a|b) ∩ (c|d) = [ab, a ∨ b'] ∩ [cd, c ∨ d']

= [ab ∨ cd, (a ∨ b') ∧ (c ∨ d')]

= ((ab ∨ cd)|(b ∨ d)),

again using abd = cbd.

Viewing cosets as intervals immediately yields (2) and (3). □
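Parts (1) and (2) of the theorem lend themselves to an exhaustive check on a small ring. The sketch below assumes R is the ring of subsets of a three-point set encoded as bitmasks. It does not attempt part (3), and the helper names are hypothetical.

```python
from itertools import product

N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def comp(x):
    return FULL & ~x


def coset(a, b):
    """(a|b) = a + Rb'."""
    return frozenset(a ^ (r & comp(b)) for r in R)


def leq(x, y):
    """x <= y in R, that is, xy = x."""
    return (x & y) == x


def expected_meet(a, b, c, d):
    """Right-hand side of part (1)."""
    if (a & b & d) == (c & b & d):
        return coset((a & b) | (c & d), b | d)
    return frozenset()


part1 = all(
    (coset(a, b) & coset(c, d)) == expected_meet(a, b, c, d)
    for a, b, c, d in product(R, repeat=4)
)

part2 = all(
    (coset(a, b) <= coset(c, d))
    == (leq(c & d, a & b) and leq(a | comp(b), c | comp(d)))
    for a, b, c, d in product(R, repeat=4)
)
```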
CHAPTER 3

LOGICAL  OPERATIONS  ON  CONDITIONAL  EVENTS 

In  this  chapter,  logical  operations  between  conditional  events  are  defined,  extending 
Boolean  operations  of  the  base  ring  R.  As  in  most  extension  problems,  such  an  extension 
is  not  unique,  and  the  one  chosen  demands  justification.  From  a  semantic  viewpoint,  the 
system  of  logical  operations  derived  here  corresponds  to  Lukasiewicz's  three-valued  logic. 
A  comparison  with  other  proposed  operations  is  given  in  Section  3.5.  A  discussion  of  the 
possibility  of  deriving  logical  operations  for  conditional  events  in  an  axiomatic  setting  is 
in  Section  3.4.  The  analysis  in  this  chapter  is  directed  toward  Boolean  rings,  with  more 
general  algebraic  structures  considered  in  Chapter  8. 

3.1  The  extension  problem 

As established in Chapter 2, for a, b ∈ R, by the conditional event "a given b", we
mean the coset a + Rb', and use the notation (a|b) for it. Since conditional events are
generalizations of events, with (a|1) corresponding to the ordinary event a in R, the
logical operations among them should be extensions of the ring operations. That is,
(a|1) + (b|1) must be ((a + b)|1), and so on. There are various ways of doing this. It
has been noted at the end of Chapter 2 that ordinary set operations on conditional events
are not appropriate. The space R|R of conditional events is the disjoint union ∪ R/Rb',
with the union over all b ∈ R. We have in each R/Rb' the usual quotient ring operations
which come from the operations of R. What is needed are operations combining cosets
from different quotient rings, that is, combining elements from R/Rb' and R/Rd' with
b ≠ d, and of course with the result of such a combination being a coset of a principal
ideal. This is not a standard ring theory operation, and has been largely avoided. For
example, Hailperin (1976) just called R|R a partial universal algebra (see, for example,
Grätzer, 1968), and considered logical operations only between elements of the same
quotient ring. That is clearly unsatisfactory. We will define operations between any two
cosets of principal ideals, and investigate the resulting algebraic structure of R|R in
Chapter 4.

For any ring R, its operations + and · induce corresponding operations on subsets of
R. Namely, for subsets A and B of R,

A + B = {a + b : a ∈ A, b ∈ B},
and

AB = {ab : a ∈ A, b ∈ B}.

We have two other commonly used operations for Boolean rings, a ∨ b = a + b + ab, and
a' = 1 + a. These extend to set operations as well, namely

A ∨ B = {a + b + ab : a ∈ A, b ∈ B}, and

A' = {a' : a ∈ A}.

A convenient fact, and one easily checked, is that for subsets A and B of R, DeMorgan's
laws hold:

(AB)' = A' ∨ B',
and

(A ∨ B)' = A'B'.

We have already been using set addition in writing down cosets: a + Rb' means
{a} + Rb', which is {a + rb' : r ∈ R}. Now coset addition in each quotient ring R/Rb'
is just this set addition. Cosets in R/Rb' are added by the formula

(a + Rb') + (c + Rb') = (a + c) + Rb',

but this coincides with the set addition above since

(a + rb') + (c + sb') = (a + c) + (r + s)b'

is in (a + c) + Rb', and the other inclusion is equally easy to check. Further, this
addition is well defined: set addition is certainly well defined, and if

a + Rb' = x + Rb'

and

c + Rb' = y + Rb',

then

(a + c) + Rb' = (x + y) + Rb'.

These remarks for coset addition are valid for any ring and any ideal I, not just Boolean
rings and principal ideals Rb'.

It is not generally true that coset multiplication is set multiplication. That is, it is
not true for all rings that set multiplication of (a + I) and (b + I) is ab + I, the product
of the two cosets (a + I) and (b + I). However, it is true for Boolean rings, and in fact
for commutative von Neumann regular rings. For Boolean rings, an arbitrary element of
(a + I)(b + I) is

(a + i)(b + j) = ab + aj + bi + ij

with i and j in I. This is clearly in ab + I. On the other hand, for

ab + k ∈ ab + I,

taking i = ka' and j = k + kba' puts ab + k in the form ab + aj + bi + ij. So coset
multiplication in Boolean rings is just set multiplication. This unusual fact suggests that
perhaps set addition and multiplication are appropriate operations on any pair of elements
of R|R. Similar remarks hold for the set operations ' and ∨ on R|R.
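The coincidence of coset and set multiplication can be checked on a finite Boolean ring, and contrasted with a ring where it fails. The sketch below assumes R is the ring of subsets of a three-point set, where every ideal is principal. The contrast with Z/4 and the ideal {0, 2} is an added example, not taken from the text, and all helper names are illustrative.

```python
N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def ideal(x):
    """Principal ideal Rx."""
    return frozenset(r & x for r in R)


def shift(a, I):
    """Coset a + I."""
    return frozenset(a ^ i for i in I)


def set_mul(A, B):
    """Set multiplication AB = {ab : a in A, b in B}."""
    return frozenset(x & y for x in A for y in B)


# In the Boolean ring, (a + I)(b + I) computed as a set equals ab + I.
boolean_ok = all(
    set_mul(shift(a, ideal(x)), shift(b, ideal(x))) == shift(a & b, ideal(x))
    for a in R for b in R for x in R
)

# Contrast (assumed example): Z/4 with the ideal I = {0, 2}.
I4 = {0, 2}
prod_z4 = {(i * j) % 4 for i in I4 for j in I4}   # set product (0 + I)(0 + I)
z4_fails = prod_z4 != I4                           # the coset product is 0 + I = I
```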

3.2 Conditional logical operations

First we will show that R|R is closed under the set operations ', +,
multiplication or ∧, and ∨. This will give us an "algebra" of conditional events, and its
properties will be subsequently investigated. It is convenient to note first that for ideals I
and J of a Boolean ring R, the product IJ = {ij : i ∈ I, j ∈ J} is indeed an ideal.
Clearly, I ∩ J is an ideal, and IJ ⊆ I ∩ J. For x in I ∩ J, x = x·x is in IJ, so IJ =
I ∩ J. Since sums of ideals are ideals, the following theorem then implies that sums,
products, and disjunctions ∨ of two cosets are cosets.

Theorem 1. Let R be a Boolean ring and let I and J be ideals of R. Then

(1) (a + I)' = a' + I,

(2) (a + I) + (b + J) = (a + b) + I + J,

(3) (a + I)(b + J) = ab + bI + aJ + IJ,

(4) (a + I) ∨ (b + J) = a ∨ b + b'I + a'J + IJ.

Proof. For (1),

(a + I)' = {(a + i)' : i ∈ I} =
{(1 + a) + i : i ∈ I} = a' + I.

For (2),

(a + I) + (b + J) = {a + i + b + j : i ∈ I, j ∈ J} =
{(a + b) + i + j : i ∈ I, j ∈ J} = (a + b) + I + J.

Part (3) is more difficult. An element in (a + I)(b + J) is of the form

(a + i)(b + j) = ab + bi + aj + ij,

which is clearly in ab + bI + aJ + IJ. The other inclusion is the difficult part. First, we
will establish it for principal ideals. So let I = Rx and J = Ry. An element in
(a + Rx)(b + Ry) is of the form

(a + rx)(b + sy) = ab + asy + brx + rsxy,

and an element of ab + bRx + aRy + RxRy is of the form

ab + btx + auy + vxwy.

Let z = vwxy + txb + uya + ab. Then letting

r = (z − a)xy + tx(1 − y),
and

s = (1 − b)xy + uy(1 − x)

puts ab + asy + brx + rsxy in the form ab + btx + auy + vxwy. This is a bit tedious
but straightforward to check. Thus we have

(a + I)(b + J) = ab + bI + aJ + IJ

for principal ideals. For arbitrary ideals I and J, we need

ab + bI + aJ + IJ ⊆ (a + I)(b + J).

That is, we need an element of the form ab + bu + av + wx to be of the form

ab + bi + aj + ij,

where i, u, and w are in I, and j, v and x are in J. Now

Ru + Rw = R(u + w + uw),

as is easily checked, and similarly Rv + Rx = R(v + x + vx). Let

e = u + w + uw

and

f = v + x + vx.

Then ab + bu + av + wx is in ab + bRe + aRf + ReRf, which, by the principal ideal
case, is (a + Re)(b + Rf), and which, in turn, is contained in (a + I)(b + J). This proves
part (3).

For (4) we use DeMorgan's laws and (3). We have

(a + I) ∨ (b + J) = [(a' + I)(b' + J)]'

= 1 + a'b' + b'I + a'J + IJ = a ∨ b + b'I + a'J + IJ. □
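The "tedious but straightforward" step in part (3) can be delegated to a machine. The sketch below assumes the witnesses are r = (z + a)xy + tx(1 + y) and s = (1 + b)xy + uy(1 + x), with z = vwxy + txb + uya + ab (in a Boolean ring subtraction is the same as addition). It checks that (a + rx)(b + sy) = z for every choice of the eight parameters in the four-element ring of subsets of a two-point set. The ring size and function name are assumptions.

```python
from itertools import product

N = 2                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def witnesses_work(a, b, x, y, t, u, v, w):
    """Check (a + rx)(b + sy) = z for the witnesses r and s."""
    z = (v & w & x & y) ^ (t & x & b) ^ (u & y & a) ^ (a & b)
    r = ((z ^ a) & x & y) ^ (t & x & (FULL ^ y))      # r = (z + a)xy + tx(1 + y)
    s = ((FULL ^ b) & x & y) ^ (u & y & (FULL ^ x))   # s = (1 + b)xy + uy(1 + x)
    return ((a ^ (r & x)) & (b ^ (s & y))) == z


all_ok = all(witnesses_work(*args) for args in product(R, repeat=8))
```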

Several comments are in order. We have, for example, the formula

(a + I)(b + J) = ab + bI + aJ + IJ.

There is no question of the product (a + I)(b + J) being well defined. It is just the
product of the two sets a + I and b + J. If representatives are changed, that is, if a is
replaced by x and b by y such that a + I = x + I and b + J = y + J, then

(a + I)(b + J) = (x + I)(y + J) = xy + yI + xJ + IJ.

Similar remarks hold for the other operations. Since addition, multiplication and
disjunction on R are commutative and associative, their extensions to subsets of R are
commutative and associative. Multiplication and disjunction are also idempotent, that is,
xx = x = x ∨ x. In particular, these three binary operations on the set of all cosets of R are
both commutative and associative. Thus we can perform these operations on any (finite)
number of cosets with the result independent of order or association.

The operations ', +, ∨, and ∧ or multiplication were defined by extending the
corresponding operations of R to subsets of R. Back in R, the operations satisfy the
following relations:

(1) x' = 1 + x,

(2) x + y = x'y ∨ xy',

(3) (xy)' = x' ∨ y' and (x ∨ y)' = x'y',

(4) x(y ∨ z) = xy ∨ xz,

(5) x ∨ (yz) = (x ∨ y)(x ∨ z),

(6) x(y + z) = xy + xz,

(7) x ∨ y = x + y + xy.

For cosets of R, only (1) through (5) hold, and those are easily checked using the
theorem above. We have already noted that DeMorgan's laws (the properties in (3)) hold
for any subsets of R. An example of the failure of (6) for cosets of principal ideals is
given below, and (7) does not hold for the cosets a + R and 1 + Ra', where a ≠ 0. In
that case, we have

(a + R) ∨ (1 + Ra') = a ∨ 1 + Ra',

(a + R) + (1 + Ra') + (a + R)(1 + Ra') = a ∨ 1 + R,
and

a ∨ 1 + Ra' ≠ a ∨ 1 + R,

since a ≠ 0.

Finally, Theorem 1 holds for commutative von Neumann regular rings with little change in
the proof. See Chapter 8.

We turn now to specializing these results to the case where the ideals are principal.
In that case, we have the three binary operations, and the unary operation ' on R|R. We
now change to the notation (a|b) for a + Rb'. The following theorem states the basic
facts about the operations on the space R|R of cosets of principal ideals.

Theorem 2. The following hold.

(1) (a|b)' = (a'|b),

(2) (a|b) + (c|d) = (a + c|bd),

(3) (a|b)(c|d) = (ac|a'b ∨ c'd ∨ bd),

(4) (a|b) ∨ (c|d) = (a ∨ c|ab ∨ cd ∨ bd).

Proof. The proof of (1) is easy. For (2),

(a|b) + (c|d) = (a + Rb') + (c + Rd') = a + c + Rb' + Rd'.

Now we have observed in the proof of the previous theorem that Rx + Ry = R(x ∨ y).
Thus

Rb' + Rd' = R(b' ∨ d') = R(bd)'.

For (3),

(a|b)(c|d) = ac + cRb' + aRd' + Rb'Rd',

using (3) of the previous theorem. We need the ideal Rb'c + Rad' + Rb'd' to be the
principal ideal R(a'b ∨ c'd ∨ bd)'. It is the principal ideal R(b'c ∨ ad' ∨ b'd'). It is
routine to show that

(a'b ∨ c'd ∨ bd)' = b'c ∨ ad' ∨ b'd'.

For (4),

(a|b) ∨ (c|d) = a ∨ c + c'Rb' + a'Rd' + Rb'd',

using (4) of the previous theorem. We need the ideal

Rb'c' + Ra'd' + Rb'd'

to be the principal ideal R(ab ∨ cd ∨ bd)', and it is the ideal R(b'c' ∨ a'd' ∨ b'd').
Again, it is routine to check that

(ab ∨ cd ∨ bd)' = b'c' ∨ a'd' ∨ b'd'. □
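The four closed forms of Theorem 2 can be compared directly with the elementwise set operations they are meant to describe. The sketch below assumes R is the ring of subsets of a three-point set encoded as bitmasks; lift, coset, and check are hypothetical helper names.

```python
from itertools import product
from operator import and_, or_, xor

N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def comp(x):
    return FULL & ~x


def coset(a, b):
    """(a|b) = a + Rb'."""
    return frozenset(a ^ (r & comp(b)) for r in R)


def lift(op, A, B):
    """Extend a binary operation of R to subsets of R."""
    return frozenset(op(x, y) for x in A for y in B)


def check(a, b, c, d):
    A, C = coset(a, b), coset(c, d)
    na, nc = comp(a), comp(c)
    return (
        frozenset(comp(x) for x in A) == coset(na, b)                       # (1)
        and lift(xor, A, C) == coset(a ^ c, b & d)                          # (2)
        and lift(and_, A, C) == coset(a & c, (na & b) | (nc & d) | (b & d)) # (3)
        and lift(or_, A, C) == coset(a | c, (a & b) | (c & d) | (b & d))    # (4)
    )


theorem2_ok = all(check(*q) for q in product(R, repeat=4))
```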

Note that (0|1) is the zero of R|R, that is, is the additive identity, and that (1|1) is
the multiplicative identity. That is, (0|1) is the only element such that

(0|1) + (a|b) = (a|b)

for all (a|b), and (1|1) is the only element such that for all (a|b),

(1|1)(a|b) = (a|b).

Indeed,

(0|1) + (a|b) = ((0 + a)|1·b) = (a|b),
and

(1|1)(a|b) = (a|(1'1 ∨ a'b ∨ b)) = (a|b).

If (x|y) were another zero, then

(0|1) + (x|y) = (0|1) = (x|y).

Similarly, (1|1) is the only multiplicative identity for R|R.

Elements in R|R do not have negatives, in general. If

(a|b) + (c|d) = ((a + c)|bd) = (0|1),

then bd = 1, so b = d = 1. So the (a|b) with negatives are exactly the (a|1), whose
negative is itself. Further, multiplication of sets does not distribute over addition of sets,
even for cosets of principal ideals. For example,

(1|b)((1|d) + (1|f)) ≠ (1|b)(1|d) + (1|b)(1|f),

the first being (0|df), and the second being (0|bdf). Just pick b, d, and f so that
df ≠ bdf. However, multiplication does distribute over ∨, and ∨ over multiplication, as
we have observed above for any cosets.
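A concrete instance makes the failure of distributivity, and the scarcity of negatives, visible. The sketch below assumes R is the ring of subsets of a two-point set encoded as bitmasks, and picks b, d, f so that df ≠ bdf. The particular values and helper names are illustrative.

```python
from operator import and_, xor

N = 2                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def coset(a, b):
    """(a|b) = a + Rb'."""
    return frozenset(a ^ (r & FULL & ~b) for r in R)


def lift(op, A, B):
    """Extend a binary operation of R to subsets of R."""
    return frozenset(op(x, y) for x in A for y in B)


b, d, f = 0b01, 0b11, 0b11          # chosen so that df != bdf
one_b, one_d, one_f = coset(FULL, b), coset(FULL, d), coset(FULL, f)

lhs = lift(and_, one_b, lift(xor, one_d, one_f))                    # (1|b)((1|d) + (1|f))
rhs = lift(xor, lift(and_, one_b, one_d), lift(and_, one_b, one_f)) # (1|b)(1|d) + (1|b)(1|f)

# Conditional events that have an additive inverse
events = {coset(a, c) for a in R for c in R}
zero = coset(0, FULL)
with_negative = {A for A in events if any(lift(xor, A, B) == zero for B in events)}
```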

In any case, the "algebra" R|R is far from being a ring under the operations of
multiplication and addition. It does, however, contain isomorphic copies of all the R/Rb',
since

(a|b) + (c|b) = ((a + c)|b),
and

(a|b)(c|b) = (ac|b).

The operations in R|R have many interesting properties and interrelations. We
record some of the more fundamental ones here. Their proofs are straightforward. In the
following, we will use just x for the element (x|1) in R|R.

Theorem 3. (Bayes) Let a_1 + a_2 + ... + a_n = 1 be a partition of 1. In particular, the a_i
are mutually disjoint. Then for b in R,

(1) b = (b|a_1)a_1 + (b|a_2)a_2 + ... + (b|a_n)a_n,

(2) (a_i|b) = (((b|a_i)a_i)|b),

(3) (a_i|b)b = (b|a_i)a_i = a_i b,

(4) (a_i|b) = ((b|a_i)a_i)|((b|a_1)a_1 + (b|a_2)a_2 + ... + (b|a_n)a_n). □

In particular, from (4) we get

b = (b|a)a + (b|a')a'
and

(a|b) = ((b|a)a)|((b|a)a + (b|a')a').

Recall that logical (material) implication b → a in R is defined to be b' ∨ a. We
denote (b → a)(a → b) by a ↔ b. These operations extend to R|R in the same manner as
the others. We define

(c|d) → (a|b) = {y → x : y ∈ (c|d), x ∈ (a|b)},
and

(c|d) ↔ (a|b)

in the obvious way.

In the following theorem, parts (1) through (5) give connections of ∨ and ∧ with
logical implication, parts (6) and (7) are absorbing properties, while part (8) is a
decomposition property. Again, the verifications are straightforward.

Theorem 4. The following hold.

(1) b → a = (a|b) ∨ b' = (b'|a') ∨ a,

(2) (a|b) = ((b → a)|b) = (b → a)(b|b),

(5)  (a\b)  =  (b'\a')(P\0)nb'  -a), 

(4) (c|d) → (a|b) = (c|d)' ∨ (a|b) = ((cd → ab)|(c'd ∨ ab ∨ bd)),

(5) (c|d) ↔ (a|b) = ((c|d) → (a|b))((a|b) → (c|d))

= ((ab ↔ cd)|bd) = ((a|b) + (c|d))',

(6) (a|b) = (a|b)((a|b) ∨ (c|d)),

(7) (a|b) = (a|b) ∨ (a|b)(c|d),

(8) (a|b) = (a|1) + (0|b). □

3.3  An  order  relation  and  related  concepts 

The Boolean ring R has a partial order ≤ given by a ≤ b if ab = a. Being a
partial order means that ≤ is reflexive, anti-symmetric, and transitive. That is, a ≤ a; a ≤ b
and b ≤ a imply that a = b; and finally, a ≤ b and b ≤ c imply that a ≤ c. The partial order
does respect multiplication and ∨, in the sense that if a ≤ b, then ac ≤ bc and a ∨ c ≤
b ∨ c. Further, a ≤ b implies that b' ≤ a'. These properties are trivial to check.

We now define a partial order on R|R in the analogous way, and note some of its
properties. In particular, it will extend the partial order on R, identifying R with the
elements of the form (r|1). Note that, in his discussion on qualitative probability, Savage
(1972, p. 44) mentioned the lack of a qualitative counterpart of P(a|b) ≤ P(c|d). It is
necessary, even from a qualitative viewpoint, to compare "interconditionals," that is,
conditionals with different antecedents. See also Koopman (1940), and our Chapter 5.

Definition. For (a|b), (c|d) ∈ R|R,

(a|b) ≤ (c|d)

if

(a|b) = (a|b)(c|d).


The relation ≤ is indeed a partial order on R|R. Since

(a|b)(a|b) = (a²|a'b ∨ a'b ∨ b²) = (a|b),

we have (a|b) ≤ (a|b), so ≤ is reflexive. If (a|b) = (a|b)(c|d) and
(c|d) = (c|d)(a|b), then certainly (a|b) = (c|d), so that ≤ is anti-symmetric. Finally, to get
transitivity for ≤, if (a|b) = (a|b)(c|d) and (c|d) = (c|d)(e|f), then

(a|b)(e|f) = ((a|b)(c|d))(e|f) =

(a|b)((c|d)(e|f)) = (a|b)(c|d) = (a|b).

The partial order above depends only on the multiplication in R|R being idempotent,
commutative, and associative. Finally, it should be noted that if

(a|b) ≤ (c|d),

then

(a|b) ∨ (e|f) ≤ (c|d) ∨ (e|f)
and

(a|b)(e|f) ≤ (c|d)(e|f),

while it is not true that (a|b) ≤ (c|d) implies that

(a|b) + (e|f) ≤ (c|d) + (e|f).

We now give some useful alternate conditions equivalent to being ≤.

Theorem 1. The following are equivalent, and hence are all equivalent to (a|b) ≤ (c|d).

(1) (a|b) = (a|b)(c|d),

(2) (c|d)' ≤ (a|b)',

(3) ab ≤ cd and c'd ≤ a'b,

(4) (c|d) = (c|d) ∨ (a|b).

Proof. First we prove that (1) implies (3). If (a|b) = (a|b)(c|d), then

(a|b) = (ac|a'b ∨ c'd ∨ bd),

and we have

ab = ac(a'b ∨ c'd ∨ bd) = abcd,

so that ab ≤ cd. Also b = a'b ∨ c'd ∨ bd, whence c'd ≤ b, and so ac'd ≤ ac'b. But
ab ≤ cd gets ac'b = 0, so ac'd = 0. Thus c'd ≤ a', and already we have c'd ≤ b. Thus
c'd ≤ a'b, and so (1) implies (3). Assume (3). Then ab ≤ cd and c'd ≤ a'b. To get

(a|b) = (a|b)(c|d) = (ac|(a'b ∨ c'd ∨ bd)),

we need first that ab = ac(a'b ∨ c'd ∨ bd). The last is acbd, which is indeed ab since
ab ≤ cd. Finally, we need b = (a'b ∨ c'd ∨ bd). Now

(a'b ∨ c'd ∨ bd) = (a' ∨ d)b ∨ c'd.

But c'd ≤ b, and abd' ≤ cdd' = 0, so b ≤ a' ∨ d. It follows that (3) implies (1).

Part (2) is equivalent to (1), using (3), and (4) is equivalent to (1) using DeMorgan's
laws and (2). □
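The equivalence of the four conditions can be confirmed exhaustively on a small ring. The sketch below assumes R is the ring of subsets of a three-point set encoded as bitmasks, computes products and disjunctions of cosets elementwise, and reads condition (2) as (c|d)' ≤ (a|b)'. The helper names are hypothetical.

```python
from itertools import product
from operator import and_, or_

N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def comp(x):
    return FULL & ~x


def coset(a, b):
    """(a|b) = a + Rb'."""
    return frozenset(a ^ (r & comp(b)) for r in R)


def lift(op, A, B):
    """Extend a binary operation of R to subsets of R."""
    return frozenset(op(x, y) for x in A for y in B)


def leq(x, y):
    """x <= y in R, that is, xy = x."""
    return (x & y) == x


def conditions_agree(a, b, c, d):
    A, C = coset(a, b), coset(c, d)
    nA = frozenset(comp(x) for x in A)
    nC = frozenset(comp(x) for x in C)
    c1 = A == lift(and_, A, C)                                  # (1)
    c2 = nC == lift(and_, nC, nA)                               # (2)
    c3 = leq(a & b, c & d) and leq(comp(c) & d, comp(a) & b)    # (3)
    c4 = C == lift(or_, C, A)                                   # (4)
    return c1 == c2 == c3 == c4


order_ok = all(conditions_agree(*q) for q in product(R, repeat=4))
```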

Note that (3) implies that ≤ is monotone in the first argument, that is, (a|b) ≤ (c|b)
if a ≤ c. More generally, if (a|b) ≤ (c|d) then (a|b) ≤ ((c ∨ x)|d), as follows readily
from (3). This is not true for the second argument. For example, (a|b) and (a|bc) are
not comparable, in general. It is not true that ab ≤ abc, so (a|b) is not ≤ (a|bc), and it
is not true that a'b ≤ a'bc, so that (a|bc) is not ≤ (a|b).


Theorem 2. The following hold.

(1) 0 ≤ ab ≤ (a|b) ≤ (b → a) ≤ 1.

(2) ab ≤ (a|b)(b|a) ≤ (a ↔ b).

(3) If a_1 ≤ a_2 ≤ ... ≤ a_n, then

(a_1|a_2)(a_2|a_3) ... (a_{n-2}|a_{n-1})(a_{n-1}|a_n) = (a_1|a_n).

(4) (a|bc)(b|c) = (ab|c). □

Items (1) and (2) above give some connections between material implication and ≤,
with (2) being an immediate consequence of (1). Items (3) and (4) are called "chaining"
conditions, and (4) is a consequence of (3).

It is possible to give a formal characterization for our operations · and ∨ on R|R.
A systematic investigation of the rationale of our operations will be given in Sections 3.4
and 3.5.

Theorem 3. The mapping φ : R|R → R defined by φ(a|b) = b → a = b' ∨ a is a
(∨,∧)-homomorphism from R|R onto R. That is,

φ[(a|b) ∨ (c|d)] = φ(a|b) ∨ φ(c|d)
and

φ[(a|b)(c|d)] = φ(a|b)φ(c|d).

Proof. First, φ is well defined, and clearly φ maps R|R onto R. Now

φ[(a|b) ∨ (c|d)] = φ(a ∨ c|a ∨ c ∨ bd) = (a ∨ c ∨ bd)' ∨ a ∨ c,

taking a ≤ b and c ≤ d without loss of generality. Also,

φ(a|b) ∨ φ(c|d) = (b' ∨ a) ∨ (d' ∨ c).

Thus we need

(a ∨ c ∨ bd)' ∨ a ∨ c = (a'c'(b' ∨ d')) ∨ a ∨ c = b'c' ∨ a'd' ∨ a ∨ c

to be b' ∨ a ∨ d' ∨ c, which it clearly is. Similarly,

φ[(a|b) ∧ (c|d)] = φ(a|b) ∧ φ(c|d). □
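The homomorphism property can be checked numerically using the pair form of conditional events. The sketch below assumes R is the ring of subsets of a three-point set and takes the closed forms (ac|a'b ∨ c'd ∨ bd) and (a ∨ c|ab ∨ cd ∨ bd) for the product and disjunction of conditional events. The function names are illustrative.

```python
from itertools import product

N = 3                    # assumed size of the underlying set
FULL = (1 << N) - 1
R = range(FULL + 1)


def comp(x):
    return FULL & ~x


def phi(a, b):
    """phi(a|b) = b -> a = b' v a."""
    return comp(b) | a


def mul_pair(a, b, c, d):
    """(a|b)(c|d) as a representative pair."""
    return (a & c, (comp(a) & b) | (comp(c) & d) | (b & d))


def join_pair(a, b, c, d):
    """(a|b) v (c|d) as a representative pair."""
    return (a | c, (a & b) | (c & d) | (b & d))


# phi does not depend on the representative: (a|b) = (ab|b)
well_defined = all(phi(a, b) == phi(a & b, b) for a in R for b in R)

hom_ok = all(
    phi(*join_pair(a, b, c, d)) == (phi(a, b) | phi(c, d))
    and phi(*mul_pair(a, b, c, d)) == (phi(a, b) & phi(c, d))
    for a, b, c, d in product(R, repeat=4)
)
```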

Theorem 4. Let ∘ and ∪ be any operations on R|R extending · and ∨ on R.
Suppose that the mapping φ given by (a|b) → b' ∨ a is a (∘,∪)-homomorphism. Suppose
further that

(a|b) ∘ (c|d) = (abcd|α(a,b,c,d))
and

(a|b) ∪ (c|d) = (ab ∨ cd|β(a,b,c,d)),

where abcd ≤ α(a,b,c,d) and (ab ∨ cd) ≤ β(a,b,c,d). Then ∘ = · and ∪ = ∨.

Proof.

φ[(a|b) ∘ (c|d)] = φ(abcd|α(a,b,c,d)) =

(b' ∨ a)(d' ∨ c) = α(a,b,c,d)' ∨ abcd =
α(a,b,c,d)' + abcd.

Let r = a'b ∨ c'd ∨ bd. Then

(b' ∨ a)(d' ∨ c) = r' + rac = r' + abcd,

whence α(a,b,c,d) = r. Thus ∘ = ·. Similarly,

φ[(a|b) ∪ (c|d)] = β(a,b,c,d)' + (ab ∨ cd) =
b' ∨ a ∨ d' ∨ c = b' ∨ ab ∨ d' ∨ cd = b' ∨ d' ∨ ab ∨ cd =

(b' ∨ d')(ab)'(cd)' + (ab ∨ cd),

the last two summands being disjoint. It follows that

β(a,b,c,d) = [(b' ∨ d')(ab)'(cd)']' = bd ∨ ab ∨ cd,

and that ∪ = ∨. □

Other operations for combining evidence

For inference purposes, it is sometimes appropriate to combine several pieces of
conditional information, that is, conditional events, using appropriate operations. If two
conditional events (a|b) and (c|d) arise from the same Boolean ring, then we have
various ways to do that now: multiply them, or use ∨, or use + in R|R, or use other
operations in R|R. Then, for example, given a probability measure P on R, one could
calculate the probability of the resulting conditional event. But what if the events a and b
came from the Boolean ring R, and c and d came from the Boolean ring S? How do
we get a single conditional event capturing the essence of the two conditional events (a|b)
and (c|d)? One way to do it is as for ordinary events. If R and S are Boolean rings,
then the Cartesian product R × S = {(r,s) : r ∈ R, s ∈ S} is a Boolean ring under the
componentwise operations. That is, just operate componentwise. Now if r is an event in
R and s is an event in S, then (r,s) is an event in, that is, is an element of, the Boolean
ring R × S. Since R × S is a Boolean ring, we can form (R × S)|(R × S). The objects of
interest are the two conditional events (a|b) and (c|d), or the pair [(a|b), (c|d)], with
a and b in R, and c and d in S, say. This pair is an element of the set

(R|R) × (S|S) = {(x,y) : x ∈ R|R, y ∈ S|S}

of all pairs of elements of R|R and S|S. But this set is in natural one-to-one
correspondence with (R × S)|(R × S) via the mapping

[(a|b), (c|d)] → ((a,c)|(b,d)).

The upshot is that the pair (a|b) and (c|d) of conditional events is associated with a
conditional event, namely one in the space (R × S)|(R × S). For any probability measure
P on R × S, one may assign the probability of [(a|b), (c|d)] to be P[(a,c)|(b,d)], which
makes sense.

Another way to combine evidence of the form above is this.  Regard the Boolean
rings R and S as rings of subsets of Ω1 and Ω2, respectively.  Let

C = {a x b : a ∈ R, b ∈ S},

that is, the set of all Cartesian products of elements of R by elements of S.  Thus each
element of C is a subset of Ω1 x Ω2, or C ⊆ 𝒫(Ω1 x Ω2), the Boolean ring of all
subsets of Ω1 x Ω2.  Now C is not a subring of the ring 𝒫(Ω1 x Ω2), as can be seen by
observing that the basic relations between union, intersection, and complement are given
by the formulas

(a x b)' = (a' x Ω2) ∪ (Ω1 x b'),

(a x b) ∩ (c x d) = (ac x bd),
and

(a x b) ∪ (c x d) = [(a x b)' ∩ (c x d)']'.

However, there is a unique smallest subring R • S of 𝒫(Ω1 x Ω2) containing C, namely
the intersection of all those subrings containing C.  The operations of the ring are then
just the usual set theoretic operations of 𝒫(Ω1 x Ω2).  For a, b ∈ R and c, d ∈ S, we
define

(a|b) x (c|d) = {e x f : e ∈ (a|b), f ∈ (c|d)}.

But observe that

(e x f) ∩ (b x d) = eb x fd = ab x cd.

Hence

e x f ∈ (ab x cd | b x d) ∈ (R • S)|(R • S).

That is,

(a|b) x (c|d) = (ab x cd | b x d).

Note that if P is a probability measure on R • S then

P[(a|b) x (c|d)] = P[(ab x cd)|(b x d)].
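
As a small numerical sketch of the identity (a|b) x (c|d) = (ab x cd | b x d), the
following Python fragment takes R and S to be the power sets of two finite sets and P to
be the uniform measure on their Cartesian product.  The finite sets, the uniform measure,
and the helper names rect, product_conditional, and cond_prob are assumptions made only
for illustration; they are not part of the text.

```python
from fractions import Fraction
from itertools import product


def rect(a, b):
    """Cartesian product a x b of two finite sets, as a frozenset of pairs."""
    return frozenset(product(a, b))


def product_conditional(a, b, c, d):
    """Return the pair (ab x cd, b x d) representing (a|b) x (c|d)."""
    return rect(a & b, c & d), rect(b, d)


def cond_prob(event, given):
    """P(event | given) under the uniform measure (an assumption of this sketch)."""
    if not given:
        raise ValueError("conditioning event has probability zero")
    return Fraction(len(event & given), len(given))


# Illustrative data: a, b are events of the first ring; c, d of the second.
a, b = frozenset({0, 1, 2}), frozenset({1, 2, 3, 4})
c, d = frozenset({0, 1}), frozenset({1, 2, 3})

consequent, antecedent = product_conditional(a, b, c, d)
p_joint = cond_prob(consequent, antecedent)
```

Under this uniform (hence product) measure, p_joint factors as P(a|b)P(c|d); for a
general probability measure on R • S no such factorization need hold.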

As an illustration of how this type of operation x among conditionals can be used
in the problem of combining evidence, consider the well-known "penguin
triangle" problem in AI, as discussed for example in (Pearl, 1988).

Let

f = flying animals
b = birds
p = penguins,

so that (f|b) = "birds fly", (f'|p) = "penguins do not fly".  For an analysis of this type of
information, see (Zadeh, 1985).  It is appropriate here to combine the pieces of evidence
(f|b) and (f'|p) via the operation x among conditionals.  This is in line with familiar
situations in statistics.  Now

(f|b) x (f'|p) = (fb x f'p | b x p).

Thus P(fb x f'p | b x p) should be close to 1 for any reasonable probability P on
R • S.

3.4  Connections  with  three-valued  logic 

So far we have studied the logical operations on the space R|R from a syntactic
viewpoint.  In this section, we will investigate the semantic relation with three-valued
logics.  To that end, we first discuss the relationship between Boolean algebras and
classical two-valued logic.  We then show that an analogous relationship exists between
R|R and three-valued logic.  Since Boolean polynomials play a fundamental role, we
begin with a discussion of them and their properties that are pertinent to our situation.
This discussion is informal, but should be sufficient for our purposes.

An elementary Boolean polynomial in the n variables X1, X2, ..., Xn is Y1Y2...Yn,
where Yi = Xi or Xi'.  The symbol Xi' should be thought of as the complement of Xi,
and the elementary polynomial should be thought of as the product, or
conjunction, of the Yi.  We are using juxtaposition to indicate this conjunction, rather than
inserting the conjunction symbol ∧.  There are 2^n of these elementary Boolean
polynomials.  A Boolean polynomial in the n variables X1, X2, ..., Xn is an expression of
the form E1 ∨ E2 ∨ ... ∨ Ek, where the Ei are distinct elementary Boolean polynomials.
Thus a Boolean polynomial is the (formal) disjunction of elementary ones.  The empty
disjunction is allowed and is denoted 0.  The order of the Ei in the disjunction is
immaterial.  (As an aside, the set of Boolean polynomials in the n variables
X1, X2, ..., Xn may be thought of as the Boolean algebra of all subsets of the set of
elementary Boolean polynomials in those variables.)

Here are some examples for the case n = 3.  There are 2^3 elementary Boolean
polynomials, namely

X1X2X3, X1'X2X3, X1X2'X3, X1X2X3', X1'X2'X3, X1'X2X3', X1X2'X3', X1'X2'X3'.
The expression

f = X1X2X3 ∨ X1X2'X3 ∨ X1X2X3' ∨ X1'X2X3'

is a Boolean polynomial in three variables having four elementary terms.

Generally, a Boolean polynomial in the variables X1, X2, ..., Xn is any expression
formed from the Xi using ∧, ∨ and '.  For example, in the case n = 3,

f = (X1X2 ∨ X1')(X2'X3 ∨ X1X3') ∨ (X3' ∨ X1)'X2X3

is such an expression.  However, manipulating this expression as if the Xi were elements
of a Boolean algebra, and ∧, ∨, and ' were the usual operations on it, one may bring f
into the form of a disjunction, or union, of elementary Boolean polynomials, and this form
is unique.  This is the well known fact that every Boolean polynomial can be written in its
disjunctive normal form.  We regard any two Boolean polynomials as the same if they have
the same disjunctive normal form.  This is the same thing as requiring that two are the
same if they induce the same Boolean function R^n → R.  The disjunctive normal form of a
Boolean polynomial is not usually the simplest form of that polynomial.  For example, if
n = 3, the Boolean polynomial

X1X2X3 ∨ X1X2'X3' ∨ X1X2'X3 ∨ X1X2X3',

which is in disjunctive normal form, may be more simply represented as X1.  Our unions
of elementary Boolean polynomials are of course in disjunctive normal form.

The  following  proposition  is  clear. 

Lemma 1.  There are 2^(2^n) Boolean polynomials in n variables.

The connection between Boolean polynomials and mappings is this.  If f is a
Boolean polynomial in n variables and R is a Boolean ring, then f induces a map
f : R^n → R by evaluation.  Note that we use the same symbol f to denote a Boolean
polynomial as well as the function it induces on any Boolean algebra R.  This is
convenient, and should cause no confusion.  For example, if n = 3 and

f = X1X2X3 ∨ X1X2'X3 ∨ X1X2X3' ∨ X1'X2X3',
then the mapping f : R^3 → R is given by the formula

f(a1, a2, a3) = a1a2a3 ∨ a1a2'a3 ∨ a1a2a3' ∨ a1'a2a3'.
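
The evaluation map can be sketched concretely when R is the Boolean algebra of all
subsets of a small finite set.  In the Python fragment below, a Boolean polynomial in
disjunctive normal form is stored as a set of 0/1 tuples, one tuple per elementary term
(1 for Xi, 0 for Xi').  The three-point universe and the names powerset and evaluate are
assumptions of this sketch, not notation from the text.  It checks the remark that the
disjunction of the four elementary terms containing X1 evaluates to X1, and it counts the
Boolean polynomials in n = 3 variables.

```python
from itertools import combinations, product


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]


def evaluate(terms, args, universe):
    """Evaluate a disjunction of elementary terms on subsets of `universe`.

    Each term is an n-tuple of 0/1 flags: 1 stands for X_i, 0 for X_i'.
    """
    result = frozenset()
    for term in terms:
        value = frozenset(universe)
        for flag, arg in zip(term, args):
            value &= arg if flag else frozenset(universe) - arg
        result |= value
    return result


universe = frozenset({0, 1, 2})
# The four elementary terms in three variables in which X_1 is unprimed.
terms_with_x1 = {(1, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0)}

subsets = powerset(universe)
collapses_to_x1 = all(
    evaluate(terms_with_x1, (a1, a2, a3), universe) == a1
    for a1, a2, a3 in product(subsets, repeat=3)
)

# A Boolean polynomial is a subset of the 2**n elementary terms.
n = 3
polynomial_count = 2 ** len(list(product((0, 1), repeat=n)))
```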


Definition 1.  A function R^n → R is called Boolean if it is induced by a Boolean polynomial
in n variables.

There are a couple of pertinent elementary facts about these evaluation maps
and elementary polynomials.  First notice that there is a natural one-to-one correspondence
between elementary polynomials in n variables and n-tuples of 0's and 1's.  For n = 3,
X1X2'X3 corresponds to (1, 0, 1), for example.  An elementary polynomial takes the
value 1 on that n-tuple of 0's and 1's to which it corresponds, and takes the value 0
on all other n-tuples of 0's and 1's.  Thus, given an n-tuple a of 0's and 1's from a
Boolean algebra R, there is exactly one elementary Boolean polynomial e in n
variables for which e(a) = 1, and that e has value 0 on all other such n-tuples.  This
fact is the basis of the following lemma.

Lemma 2.  For any Boolean algebra R and any n, distinct Boolean polynomials in n
variables induce distinct maps R^n → R.

Proof.  Suppose f and g are Boolean polynomials in n variables and f ≠ g.  Then
there exists an elementary polynomial e in n variables such that e is a term of f and
not a term of g (say).  Now e(a) = 1 for exactly one n-tuple a of 0's and 1's, and
f(a) = 1 while g(a) = 0.  Thus the polynomials f and g induce distinct mappings from
R^n to R.  □

Lemma 3.  Let {0, 1} be the two-element Boolean algebra.  Then every map

f : {0, 1}^n → {0, 1} is induced by a Boolean polynomial in n variables.

Proof.  There are 2^(2^n) Boolean polynomials in n variables and 2^(2^n) maps.  Use
Lemmas 1 and 2.  □


The previous lemma says that given any map

g : {0, 1}^n → {0, 1},

there is a Boolean polynomial f in n variables inducing that map g.  That Boolean
polynomial is easy to construct, given g.  For each n-tuple a from {0, 1} for which
g(a) = 1, there is exactly one elementary Boolean polynomial e for which e(a) = 1.  The
Boolean polynomial inducing g is the union of those e.  Thus, if

g : {0, 1}^n → {0, 1}

is presented by a table

(a1, ..., an)    g(a1, ..., an)

then the Boolean polynomial inducing it is the union, over those rows with
g(a1, a2, ..., an) = 1, of the elementary Boolean polynomials Y1Y2...Yn where
Yi = Xi if ai = 1 and Yi = Xi' if ai = 0.  For example, for n = 3, if the table is

a1   a2   a3   g(a1, a2, a3)
0    0    0    0
0    0    1    1
0    1    0    0
1    0    0    1
0    1    1    0
1    0    1    0
1    1    0    1
1    1    1    0

then the Boolean polynomial is

X1'X2'X3 ∨ X1X2'X3' ∨ X1X2X3'.
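
The construction just described is mechanical, and a minimal Python sketch may make it
concrete.  The function name dnf_from_table, the dictionary encoding of the table, and
the use of the letter V for disjunction in the output string are assumptions of this
sketch.  The table used is one that takes the value 1 exactly on (0, 0, 1), (1, 0, 0),
and (1, 1, 0).

```python
from itertools import product


def dnf_from_table(table, n):
    """Build the disjunction of elementary terms on which the table is 1."""
    terms = []
    for row in product((0, 1), repeat=n):
        if table[row] == 1:
            # Y_i = X_i when the i-th entry is 1, and X_i' when it is 0.
            factors = [f"X{i}" if bit else f"X{i}'"
                       for i, bit in enumerate(row, start=1)]
            terms.append("".join(factors))
    # The empty disjunction is denoted 0.
    return " V ".join(terms) if terms else "0"


# Truth table for n = 3: g is 1 exactly on (0,0,1), (1,0,0), and (1,1,0).
ones = {(0, 0, 1), (1, 0, 0), (1, 1, 0)}
table = {row: int(row in ones) for row in product((0, 1), repeat=3)}

polynomial = dnf_from_table(table, 3)
```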


Corollary.  If R is any Boolean algebra and if f : R^n → R is a Boolean function, then f is
completely determined by its action on {0, 1}^n.


Let R and S be sets, and let t : R → S be any function.  Then t induces a function
t^n : R^n → S^n by the formula

t^n(r1, r2, ..., rn) = (t(r1), t(r2), ..., t(rn)).

Now suppose that R and S are Boolean algebras and t is a homomorphism.  That is, t is
a function such that

t(r ∨ s) = t(r) ∨ t(s),

t(r ∧ s) = t(r) ∧ t(s),
and

t(r') = t(r)'

for r, s in R.  In particular, t(0) = 0 and t(1) = 1, as may be checked.  If f is any
Boolean polynomial in n variables, then since t is a homomorphism, we have
immediately that for (r1, r2, ..., rn) ∈ R^n,

t(f(r1, r2, ..., rn)) = f(t(r1), t(r2), ..., t(rn)).

This may be rephrased as follows.

Proposition 1.  Let R and S be Boolean rings and t : R → S a homomorphism.  Let f
and g be Boolean polynomials in n variables.  Then the diagram

             f
     R^n ---------> R
      |             |
  t^n |             | t
      v             v
     S^n ---------> S
             g

commutes if and only if f = g.

Proof.  Suppose that f = g.  Since t is a homomorphism, we have
t(f(a1, ..., an)) = f(t(a1), ..., t(an)) = g(t(a1), ..., t(an)),

whence the diagram commutes.  Now suppose that f and g are Boolean polynomials such
that the diagram commutes.  Since t(0) = 0 and t(1) = 1, f and g must induce the same
map on {0, 1}^n, which is contained in both R^n and S^n.  Thus by Lemma 2, f = g.  □
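
The commuting square can be checked exhaustively in a small case.  The sketch below
assumes R is the algebra of all subsets of a three-point set, S = {0, 1}, and t is
evaluation at a fixed point (which is a homomorphism); these choices, and the names
eval_sets, eval_bits, and truth_evaluation, are illustrative assumptions only.  A
polynomial is again stored as a set of 0/1 tuples, one per elementary term.

```python
from itertools import combinations, product

UNIVERSE = frozenset({0, 1, 2})


def eval_sets(terms, args):
    """Evaluate a DNF polynomial on subsets of UNIVERSE."""
    result = frozenset()
    for term in terms:
        value = UNIVERSE
        for flag, arg in zip(term, args):
            value &= arg if flag else UNIVERSE - arg
        result |= value
    return result


def eval_bits(terms, bits):
    """Evaluate the same polynomial on the two-element algebra {0, 1}."""
    return int(any(all(b == flag for flag, b in zip(term, bits))
                   for term in terms))


def truth_evaluation(omega):
    """The homomorphism t(x) = 1 if omega lies in x, else 0."""
    return lambda subset: int(omega in subset)


terms = {(1, 0), (0, 1)}  # X1X2' V X1'X2
subsets = [frozenset(s) for r in range(len(UNIVERSE) + 1)
           for s in combinations(sorted(UNIVERSE), r)]
t = truth_evaluation(1)

# t(f(a, b)) should equal f(t(a), t(b)) for every pair of subsets.
commutes = all(
    t(eval_sets(terms, (a, b))) == eval_bits(terms, (t(a), t(b)))
    for a, b in product(subsets, repeat=2)
)
```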


Now we specialize the results above to the case where S is the two element Boolean
algebra {0, 1}.  In that case, the homomorphism t is called a truth evaluation on R.  In the
diagram

              f
      R^n ---------> R
       |             |
   t^n |             | t
       v             v
   {0, 1}^n ------> {0, 1}
             Ψ_f

if f is any Boolean polynomial, then the map Ψ_f induced by the Boolean polynomial f
is called the truth function, or truth table of the Boolean function f.  It of course depends
also on the homomorphism t, that is on a truth evaluation on R.  More generally, any
function Ψ : {0, 1}^n → {0, 1} is called a truth function, or truth table.  So for the case
S = {0, 1}, the result may be stated as follows:


Theorem 1.  Let R be a Boolean algebra, let t be a truth evaluation on R, and let
f : R^n → R be a Boolean function.  Then there is exactly one truth function

Ψ_f : {0, 1}^n → {0, 1}

such that

Ψ_f ∘ t^n = t ∘ f,

namely Ψ_f = f.  Conversely, given a truth function Ψ, that is any mapping

Ψ : {0, 1}^n → {0, 1},

there is exactly one Boolean function f : R^n → R such that

Ψ ∘ t^n = t ∘ f,

namely that given by the Boolean polynomial f inducing Ψ.  In particular, there is a
one-to-one correspondence between truth functions {0, 1}^n → {0, 1} and Boolean
functions R^n → R.  □


Finally, it should be noted that above, given f, the construction of Ψ_f is
immediate.  If f is given as a Boolean polynomial, then there is no computation to be
made for the construction: Ψ_f is that same Boolean polynomial.  In any case, f is
determined by its action on {0, 1}^n inside R^n, and given any function from {0, 1}^n to
{0, 1}, we have specified earlier how to write down the Boolean polynomial inducing that
function.  So the construction of Ψ_f from f is routine.  Now given Ψ : {0, 1}^n → {0, 1},
write down the Boolean polynomial inducing Ψ, and that gives the unique f such that
Ψ ∘ t^n = t ∘ f.  So not only do the requisite f's and Ψ's exist, we have an explicit
procedure for constructing them.

We are now going to generalize the results above to the conditional case.  In
particular, R|R will play the role of R.  First, we must decide on, and develop the relevant
properties of, the analogs of Boolean polynomials for the conditional case.  That is, which
maps (R|R)^n → R|R should play the role that Boolean maps R^n → R play?  Elements of
R|R are of the form (a|b), with a, b ∈ R.  This representation is unique if a is taken to
be contained in b, that is if ab = a.  Any mapping (R|R)^n → R|R takes an element of
the form (a1|b1, a2|b2, ..., an|bn) to one of the form (a|b).  Again, a is not unique,
but ab is, and thus ab should be a function of the 2n variables

(a1b1, a2b2, ..., anbn, b1, b2, ..., bn).

We require that this function be induced by a Boolean polynomial f of 2n variables.
Similar requirements are mandated for the existence of a Boolean polynomial g of 2n
variables yielding b.  But the situation is not as simple as in the classical case.  Different
Boolean polynomials can induce the same mappings on 2n-tuples of the form
(r1, r2, ..., rn, r_{n+1}, ..., r_{2n}), where ri ≤ r_{n+i}.  A moment's reflection shows
that two such polynomials induce the same mapping on such 2n-tuples if and only if their
elementary terms are the same except for those of the form

Y1Y2 ... Xi ... YnY_{n+1} ... X_{i+n}' ... Y_{2n}.

These are precisely those elementary terms that are 0 on 2n-tuples of the form
(r1, r2, ..., rn, r_{n+1}, ..., r_{2n}) where ri ≤ r_{n+i}.  We call a Boolean polynomial
in 2n variables reduced if it contains no elementary terms of the form displayed above.  It
should be clear that in our considerations here, only reduced Boolean polynomials need be
considered.  Thus we are requiring that a function (R|R)^n → R|R be given by two
reduced Boolean polynomials f and g of 2n variables.  The polynomial f will consist
of some of the elementary terms of g, so that f ≤ g in that sense.  Such a pair of Boolean
polynomials will be denoted f|g, and is called a conditional Boolean polynomial of 2n
variables.  For any Boolean algebra R, a conditional Boolean polynomial of 2n variables
induces a function (R|R)^n → R|R by the formula

(f|g)(a1|b1, a2|b2, ..., an|bn) =

(f(a1b1, a2b2, ..., anbn, b1, b2, ..., bn) | g(a1b1, a2b2, ..., anbn, b1, b2, ..., bn)).


Lemma 4.  There are 3^(3^n) conditional Boolean polynomials of 2n variables.

Proof.  A conditional Boolean polynomial is of the form f|g, with f and g
reduced and the elementary terms of f among those of g.  The number of reduced
elementary Boolean polynomials of 2n variables is 3^n.  To see this, note that for such a
polynomial, there are 2^n choices for its first n entries.  For those entries that are Xi,
there is only one choice for the (i + n)-th entry, namely X_{i+n}.  For those entries that are
Xi', there are two choices for the (i + n)-th entry, namely X_{i+n} or X_{i+n}'.  Writing
C(n, i) for the binomial coefficient, there are C(n, i) 2^i elementary terms in which
i of the first n entries are of the form Xj'.  It follows that there are indeed

Σ_{i=0}^{n} C(n, i) 2^i = 3^n

elementary reduced Boolean polynomials of 2n variables.  There are C(3^n, i) reduced
polynomials g with i elementary terms, and for each such g one has the choice of
2^i f's.  Thus there are

Σ_{i=0}^{3^n} C(3^n, i) 2^i = 3^(3^n)

possible f|g's, and the proof is complete.  □
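
The two counts in this proof are easy to confirm by enumeration for small n.  The Python
sketch below lists the elementary terms in 2n variables that contain no pair Xi, X_{i+n}'
and then counts the pairs f|g; the function names reduced_terms and
count_conditional_polynomials are assumptions of the sketch.

```python
from itertools import product
from math import comb


def reduced_terms(n):
    """Elementary terms in 2n variables containing no pair X_i, X_{i+n}'.

    A term is a 2n-tuple of flags: 1 for an unprimed variable, 0 for a primed one.
    """
    return [t for t in product((0, 1), repeat=2 * n)
            if all(not (t[i] == 1 and t[i + n] == 0) for i in range(n))]


def count_conditional_polynomials(n):
    """Count pairs f|g: g is any set of reduced terms, f any subset of g's terms."""
    m = len(reduced_terms(n))
    return sum(comb(m, i) * 2 ** i for i in range(m + 1))


counts = {n: (len(reduced_terms(n)), count_conditional_polynomials(n))
          for n in (1, 2)}
```

For n = 1 and n = 2 the enumeration agrees with the closed forms 3^n and 3^(3^n).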


Any Boolean algebra R contains the two element Boolean algebra {0, 1}.  We
denote this two element Boolean algebra by V.  Thus inside R|R is

V|V = {(0|1), (1|1), (0|0)},

and so inside (R|R)^n is (V|V)^n.  Now V|V will play the role here that V did in the
classical two valued case.  The elements (0|1), (1|1), (0|0) will be identified with the
truth values 0 (false), 1 (true), and u (undecided), respectively.

Lemma 5.  Let R be any Boolean algebra.  Distinct conditional Boolean polynomials
induce distinct functions (R|R)^n → R|R.

Proof.  This follows from the observation that distinct reduced Boolean polynomials
induce distinct mappings on the set of sequences (r1, r2, ..., rn, r_{n+1}, ..., r_{2n}) of
0's and 1's with ri ≤ r_{i+n}.

Lemma 6.  Every function (V|V)^n → V|V is induced by exactly one conditional Boolean
polynomial in 2n variables.

Proof.  There are 3^(3^n) such functions.  Use Lemmas 4 and 5.

Definition 2.  A function (R|R)^n → R|R is a conditional Boolean function if it is induced
by a conditional Boolean polynomial.

For a conditional Boolean polynomial f|g of 2n variables, the function
it induces will also be denoted f|g.  Such a Boolean function
f|g : (R|R)^n → R|R is determined by its action on (V|V)^n.  This follows from Lemma 6.

Let R and S be Boolean algebras, and let t : R → S be a homomorphism.  Then t
induces a function R|R → S|S, which we also denote by t, by the formula

t(a|b) = (t(ab)|t(b)).

Now t is well defined since t : R → S is a homomorphism so that t(ab) ≤ t(b).  The
function t : R|R → S|S induces in the usual way the function t^n : (R|R)^n → (S|S)^n.  The
following proposition generalizes Proposition 1 to the conditional case.

Proposition 2.  Let R and S be Boolean algebras, let t : R|R → S|S be induced by a
homomorphism t : R → S, and let f|g and h|k be conditional Boolean polynomials
in 2n variables.  Then the diagram

                f|g
    (R|R)^n ---------> R|R
       |                |
   t^n |                | t
       v                v
    (S|S)^n ---------> S|S
                h|k

commutes if and only if f|g = h|k.

Proof.  Suppose that f|g = h|k.  Then

t((f|g)(a1|b1, ..., an|bn))

= t(f(a1b1, ..., anbn, b1, ..., bn) | g(a1b1, ..., anbn, b1, ..., bn))

= (t(f(a1b1, ..., anbn, b1, ..., bn)) | t(g(a1b1, ..., anbn, b1, ..., bn)))

= (f(t(a1b1), ..., t(anbn), t(b1), ..., t(bn)) | g(t(a1b1), ..., t(anbn), t(b1), ..., t(bn)))

= (f|g)((t(a1b1)|t(b1)), ..., (t(anbn)|t(bn)))

= (h|k)(t^n(a1|b1, ..., an|bn)),

and the diagram commutes.  Conversely, if the diagram commutes, then since t is the
identity on V|V, viewed as contained in both R|R and S|S, the conditional Boolean
polynomials must agree on (V|V)^n, whence they are equal by Lemma 6.  □

For the case n = 2, a conditional Boolean polynomial f|g gives a binary operation
on R|R and one on S|S, and the commutativity of the diagram just says that
t : R|R → S|S is a homomorphism with respect to those operations.  Thus t is a
homomorphism for any binary operation induced on R|R and S|S by any conditional
Boolean polynomial f|g.

Let t be a truth evaluation on the Boolean algebra R.  That is, t is a
homomorphism from R to the two element Boolean algebra {0, 1} = V.

Definition 3.  A truth evaluation on R|R is a function t : R|R → V|V induced by a truth
evaluation t on R by the formula

t(a|b) = (t(ab)|t(b)).

Note that we are using t both for the truth evaluation on R and the truth evaluation it
induces on R|R.  Viewing R|R as containing R, the truth evaluation on R|R induced by a
truth evaluation t on R is an extension of t to all of R|R.  Viewing V|V as a subset of
R|R, a truth evaluation t on R|R is the identity function on V|V.

Since V|V = {(0|1), (1|1), (0|0)} has three elements, each conditional event
(a|b) has one of three possible truth values, (0|1) (false, or 0), (1|1) (true, or 1), and
(0|0) (undecided, or u).  The truth value t(a|b) of (a|b) is thus called true if
t(ab) = 1, false if t(a'b) = 1, and undecided if t(b') = 1.
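
The three cases can be made concrete when R is an algebra of subsets of a finite set and
the truth evaluation is "membership of a fixed point ω", which is one particular
homomorphism onto {0, 1}.  The following Python sketch rests on those assumptions; the
function name truth_value and the string labels "1", "0", "u" are chosen here for
illustration.

```python
def truth_value(a, b, omega):
    """Truth value of (a|b) under the truth evaluation t(x) = [omega in x].

    Returns "1" (true), "0" (false), or "u" (undecided).
    """
    t_ab = int(omega in (a & b))
    t_b = int(omega in b)
    # (t(ab)|t(b)) is an element of V|V; (1, 0) cannot occur because ab lies in b.
    labels = {(1, 1): "1", (0, 1): "0", (0, 0): "u"}
    return labels[(t_ab, t_b)]


a = frozenset({1, 2})
b = frozenset({2, 3})
values = {omega: truth_value(a, b, omega) for omega in range(4)}
```

Here the point 2 lies in ab, the point 3 lies in a'b, and the points 0 and 1 lie in b',
which gives the three truth values in turn.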


We pause here to discuss these three possible truth values, their justification,
motivation, and history.  The connection of conditional events and three-valued logic, at
an informal level, appeared in DeFinetti (1964).  Following his discussion on conditional
prevision and probability, in which the concept of conditional events was mentioned
(DeFinetti, 1974, vol I, p.134), he brought out the connection as follows.  In the
conditional event (a|b), there are three cases to consider, ab, a'b, and b', corresponding
to "thesis", "anti-thesis", and "anti-hypothesis", respectively.  The event a enters the
picture only through its intersection with b.  Thus (a|b) can be written in its "reduced"
form (ab|b).  For DeFinetti, (a|b) is a formal object with no strict mathematical
meaning.  He stated that "one might consider (a|b) as a tri-event with values (1|1) = 1,
(0|1) = 0, and (0|0) = ∅, where 1 = true, 0 = false, and ∅ = void, according as it leads to
a "win", or a "loss", or a "calling off" of a possible conditional bet."

A similar idea appeared in Schay (1968).  Generalizing indicator functions of
ordinary events, Schay defined conditional events (a|b) as functions, defined on a
sample space Ω and taking three possible values {0, 1, u}, with u denoting
"undefined".  This approach is similar to the one taken in fuzzy set theory (Zadeh, 1965).
The truth space {0, 1, u} is standard in three-valued logic.  (See Rescher, 1969.)
However, in the calculations to be presented in this chapter, DeFinetti's notation will be
used, and we will justify the meaning given to the symbols (1|1), (0|1), and (0|0).  (See
also, Boole, 1854, pp. 89-97, and Hailperin, 1976, pp. 123-137.)

In classical two-valued logic, the truth values of a Boolean expression such as
b → a, or equivalently b' ∨ a, are determined from those of the variables a and b.  The
truth space {0, 1} is a Boolean ring which can be viewed as being contained in every
Boolean ring R, so that the determination of the possible truth values of a Boolean
expression is equivalent to that determination for the case R = {0, 1}.  That is, the
determination of the possible truth values can be made by substituting only the values 0
and 1 into the expression.  Consider now a conditional event (a|b).  It is not a Boolean
expression, but one can formally apply this evaluation process to get the possible "truth
values" of (a|b).  This is what DeFinetti did.  Doing this for (a|b) yields the three
possibilities (1|1), (0|1), and (0|0) = (1|0).  Using our modeling of conditional events
as cosets of principal ideals,

(1|1) = 1 + {0, 1}0 = 1 + {0} = {1},

(0|1) = 0 + {0, 1}0 = 0 + {0} = {0}, and

(0|0) = 0 + {0, 1}1 = 0 + {0, 1} = {0, 1}.

The first two we identify with "true" and "false", respectively, but there is a third possible
"truth value" (0|0) = {0, 1}, which can be interpreted as "undecided" since we cannot
reasonably choose one of the values "true" or "false" for (a|b) when both a and b are
0.

Now back to our more mathematical truth evaluations t : R|R → V|V.  In the
diagram

                f|g
    (R|R)^n ---------> R|R
       |                |
   t^n |                | t
       v                v
    (V|V)^n ---------> V|V
              Ψ_{f|g}

if f|g is a conditional Boolean polynomial and t is a truth evaluation on R|R, the map
Ψ_{f|g} induced by that polynomial on (V|V)^n is called the truth function or truth table of
f|g.  It of course depends on the truth evaluation t.  More generally, any function
Ψ : (V|V)^n → V|V is called a truth function or truth table.  Here is our main theorem for
the conditional case.  It follows immediately from Proposition 2 and
Lemma 6.


Theorem 2.  Let R be a Boolean algebra, let t be a truth evaluation on R|R, and let
f|g : (R|R)^n → R|R be a conditional Boolean function.  Then there is exactly one truth
function

Ψ_{f|g} : (V|V)^n → V|V

such that

Ψ_{f|g} ∘ t^n = t ∘ (f|g),

namely Ψ_{f|g} = f|g.  Conversely, given a truth function Ψ, that is, any mapping

Ψ : (V|V)^n → V|V,

there is exactly one conditional Boolean function f|g : (R|R)^n → R|R such that

Ψ ∘ t^n = t ∘ (f|g),

namely that given by the conditional Boolean polynomial f|g inducing Ψ.  In particular,
there is a one-to-one correspondence between truth functions (V|V)^n → V|V and
conditional Boolean functions (R|R)^n → R|R.  □


In the theorem, given f|g, how can one actually construct Ψ_{f|g}?  Given Ψ, how
can one actually construct f|g?  If f|g is given, it is almost always given in the form of
a conditional Boolean polynomial, in which case simply take Ψ_{f|g} = f|g.  In any case,
the action of the function f|g on (V|V)^n is given, and that action determines f|g.  So
from a function (V|V)^n → V|V, we need to construct the conditional Boolean polynomial
inducing it.  Thus we need to construct two Boolean polynomials inducing two given
Boolean functions V^{2n} → V.  We have seen earlier how to do this explicitly.  Now,
conversely, this is the same problem as constructing from Ψ : (V|V)^n → V|V the requisite
f|g.  So carrying out these constructions is just a problem in constructing Boolean
polynomials inducing given functions V^{2n} → V.  We will have occasion to carry out some
of these constructions in Section 3.5 for the cases n = 1 and n = 2.

In case n = 2, each conditional Boolean polynomial gives a binary operation on
R|R, and in particular on V|V, and we have a one-to-one correspondence between binary
operations (given by conditional Boolean polynomials) on R|R and (binary) truth
functions on V|V.  The case n = 1, of course, gives a unary operation on R|R, or just a
mapping from R|R into itself, and there is a one-to-one correspondence between unary
operations on R|R (given by conditional Boolean polynomials) and unary truth functions
on V|V.  The space V|V = {(0|1), (1|1), (0|0)} is called the truth space.  We sometimes
label its elements 0, 1, and u for (0|1), (1|1), and (0|0), respectively, thinking of 0 as
false, 1 as true, and u as undecided.  Various authors have defined logical connectives, or
operators ∨, ∧, and ' on R|R, and there are several well known sets of truth tables for the
truth space V|V.  Given logical operators ∨, ∧, and ' on R|R, there are corresponding truth
tables for them.  These truth tables may or may not be reasonable ones from a logical
point of view.  It is typical that a three-valued logic is specified by giving five truth tables,
one for each of the connectives ∨, ∧, ', →, and ↔.  In any case, truth tables for them give
rise to algebraic operations on R|R, and with these operations, R|R may or may not be an
interesting or tractable algebraic system.  This one-to-one correspondence between truth
tables (for V|V) and operations on R|R is of interest, with this latter structure providing a
syntactic home for a given three-valued logic.  We will look at several such
correspondences in Section 3.5.

In discussing conditional Boolean polynomials, we have stuck to those f|g in
reduced form. That is, f and g are Boolean polynomials in normal disjunctive form
with no terms of the form

\[ \cdots X_i \cdots X_{n+i}' \cdots \qquad (1 \le i \le n), \]

and every elementary term of f is one of g. Usually, a Boolean polynomial can be
written in much more compact form than its normal disjunctive form. For this reason, and
for computational purposes, we indicate how to associate a conditional Boolean
polynomial with f|g for any Boolean polynomials f and g. To do this, just put f and
g in their disjunctive normal forms, discard from each their elementary terms of the form
displayed above, and from f those elementary terms not in g. This last step is the same
as "intersecting" f with g. In fact, one could intersect f and g first, and then put fg
and g in their normal disjunctive forms, discarding those terms of the form displayed
above. This gives a pair f|g in reduced form, and starting from any pair, it should be
clear that it is associated with exactly one f|g in reduced form. Further, any pair f|g
induces a function

\[ f|g : (R|R)^n \to R|R \]

just as in the case of reduced forms, and two f|g's induce the same function if and only if
they have the same reduced form. We will call two f|g's equivalent if they have the
same reduced form, or what is the same thing, if they induce the same mapping just
indicated.
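The reduction procedure can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' algorithm: it represents an elementary term by a 0/1 assignment of the 2n variables, keeps only the assignments with x_i ≤ x_(n+i) (those not of the discarded form), and compares reduced forms as sets. The helper names `admissible`, `reduced_form`, and `equivalent` are hypothetical; the polynomials f, g, h, k are those of the worked example in this section (n = 2).

```python
from itertools import product

def admissible(n):
    """All 0/1 assignments (x_1..x_2n) with x_i <= x_(n+i), i.e. no X_i X_(n+i)' term."""
    return [v for v in product((0, 1), repeat=2 * n)
            if all(v[i] <= v[n + i] for i in range(n))]

def reduced_form(f, g, n):
    """Reduced form of f|g, stored as (terms of fg, terms of g) over admissible assignments."""
    g_terms = {v for v in admissible(n) if g(*v)}
    fg_terms = {v for v in g_terms if f(*v)}      # "intersecting" f with g
    return fg_terms, g_terms

def equivalent(pair1, pair2, n):
    """Two pairs are equivalent when their reduced forms coincide."""
    return reduced_form(*pair1, n) == reduced_form(*pair2, n)

# The polynomials f|g and h|k of the worked example (n = 2).
def f(x1, x2, x3, x4):
    return (not x1) or x2 or (not x3 and not x4)

def g(x1, x2, x3, x4):
    return (not x1 and x3) or (x2 and x4) or (x3 and x4) or (not x3 and not x4)

def h(x1, x2, x3, x4):
    return (x1 and x2) or (not x1 and x3) or (x2 and not x3) or (not x3 and not x4)

def k(x1, x2, x3, x4):
    return (not x1 and x3) or (x1 and x4) or (x2 and not x3) or (not x3 and not x4)

print(equivalent((f, g), (h, k), 2))
```

For n = 2 only nine of the sixteen assignments are admissible, so the comparison is a small finite check rather than a symbolic manipulation.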

The procedure outlined above is useful in verifying that two pairs f|g are
equivalent. We illustrate with an example. Let n = 2, and consider the two conditional
polynomials

\[ f|g = (X_1' \vee X_2 \vee X_3'X_4') \,|\, (X_1'X_3 \vee X_2X_4 \vee X_3X_4 \vee X_3'X_4') \]

and

\[ h|k = (X_1X_2 \vee X_1'X_3 \vee X_2X_3' \vee X_3'X_4') \,|\, (X_1'X_3 \vee X_1X_4 \vee X_2X_3' \vee X_3'X_4'). \]

Now it is an easy calculation to get

\[ fg = X_1'X_3 \vee X_2X_4 \vee X_3'X_4' \]

and

\[ hk = X_1X_2X_4 \vee X_1'X_3 \vee X_2X_3' \vee X_3'X_4'. \]

Still, it is not clear at all that

\[ fg|g = (X_1'X_3 \vee X_2X_4 \vee X_3'X_4') \,|\, (X_1'X_3 \vee X_2X_4 \vee X_3X_4 \vee X_3'X_4') \]

and

\[ hk|k = (X_1X_2X_4 \vee X_1'X_3 \vee X_2X_3' \vee X_3'X_4') \,|\, (X_1'X_3 \vee X_1X_4 \vee X_2X_3' \vee X_3'X_4') \]

are equivalent. The disjunctive normal form of fg = X_1'X_3 ∨ X_2X_4 ∨ X_3'X_4' is

\[
\begin{aligned}
& X_1'X_2X_3X_4 \vee X_1'X_2'X_3X_4 \vee X_1'X_2X_3X_4' \vee X_1'X_2'X_3X_4' \\
& \vee X_1X_2X_3X_4 \vee X_1'X_2X_3X_4 \vee X_1X_2X_3'X_4 \vee X_1'X_2X_3'X_4 \\
& \vee X_1X_2X_3'X_4' \vee X_1'X_2X_3'X_4' \vee X_1X_2'X_3'X_4' \vee X_1'X_2'X_3'X_4',
\end{aligned}
\]

which, after discarding duplicate terms and those of the forms X_1 W X_3' Y and W X_2 Y X_4',
becomes

\[
\begin{aligned}
& X_1'X_2X_3X_4 \vee X_1'X_2'X_3X_4 \vee X_1'X_2'X_3X_4' \\
& \vee X_1X_2X_3X_4 \vee X_1'X_2X_3'X_4 \vee X_1'X_2'X_3'X_4'.
\end{aligned}
\]

Similarly, the normal disjunctive form of hk = X_1X_2X_4 ∨ X_1'X_3 ∨ X_2X_3' ∨ X_3'X_4' is

\[
\begin{aligned}
& X_1X_2X_3X_4 \vee X_1X_2X_3'X_4 \\
& \vee X_1'X_2X_3X_4 \vee X_1'X_2'X_3X_4 \vee X_1'X_2X_3X_4' \vee X_1'X_2'X_3X_4' \\
& \vee X_1X_2X_3'X_4 \vee X_1'X_2X_3'X_4 \vee X_1X_2X_3'X_4' \vee X_1'X_2X_3'X_4' \\
& \vee X_1X_2X_3'X_4' \vee X_1'X_2X_3'X_4' \vee X_1X_2'X_3'X_4' \vee X_1'X_2'X_3'X_4'.
\end{aligned}
\]

Again, discarding duplicate terms and those of the forms X_1 W X_3' Y and W X_2 Y X_4' yields

\[
\begin{aligned}
& X_1X_2X_3X_4 \vee X_1'X_2X_3X_4 \vee X_1'X_2'X_3X_4 \\
& \vee X_1'X_2'X_3X_4' \vee X_1'X_2X_3'X_4 \vee X_1'X_2'X_3'X_4',
\end{aligned}
\]

which is the same form as that of fg. Similarly, g and k have the same such forms, so
that f|g and h|k are equivalent. They represent the same conditional Boolean
polynomials, and induce the same conditional Boolean functions.

In summary, the situation is this. There is a one-to-one correspondence between
(conditional) logical operations on R|R and truth functions on the truth space {0, 1, u}.
Note however that, unlike the case of R, there is a variety of three-valued logics. See, for
example, Rescher (1969) for background. Also, note the difference with the Boolean case:
since both R and {0, 1} are Boolean rings, truth evaluations are specified as
homomorphisms; the situation in three-valued logics is somewhat different. Indeed, as far
as three-valued logics are concerned, all logicians insist on the choice of some system of
"truth tables" for basic connectives between implicative propositions without syntax
considerations. This is not surprising since the concrete space R|R of implicative
propositions, as a mathematical entity, was never considered at the level of Boolean rings
for unconditional propositions. Now, since R|R is shown to be the space of all cosets of
principal ideals of R, it is possible to investigate its algebraic structures induced by
semantic considerations.

In the case of R|R, which has no a priori algebraic structure, we have only at our
disposal truth evaluations τ : R|R → {0, 1, u} defined previously. The objective is to
establish an analogous commutative diagram for the conditional case. This type of
diagram will provide algebraic structures for R|R from given semantics and vice versa.
If {0, 1} is the truth space in classical two-valued logic, then formally ({0, 1} | {0, 1})
is the truth space for elements of the conditional space R|R. From the above
identification, we see that three-valued logic is natural for conditional events. This is in
line with earlier considerations of DeFinetti (1964) and Schay (1968). It is interesting to
note that the symbols (0|1), (1|1), (0|0) appeared also in Boole's Laws of Thought
(Boole, 1854), apparently in his attempt to provide a disjunctive normal form for ratios of
propositions. See also Hailperin (1976).

Another constructive proof of Theorem 2 will now be given. First, in view of
Stone's Representation Theorem, we regard the Boolean ring R as a field of subsets of
some set Ω. As such, truth evaluations can be expressed in terms of indicator functions.
Recall that the generalized indicator function of (a|b), for a, b ∈ R, is defined as:

\[
\varphi(a|b) : \Omega \to \{0, u, 1\}, \qquad
\varphi(a|b)(\omega) =
\begin{cases}
1 & \text{if } \omega \in ab, \\
0 & \text{if } \omega \in a'b, \\
u & \text{if } \omega \in b'.
\end{cases}
\]

Assuming a ≤ b, (a|b) partitions Ω as a, a′b, b′, so that if we let

i = 1, 2, ..., n, there are only three pairs (0, 1), (0, 0), (1, 1) for each (δ_i, γ_i); thus, letting

\[
j_i =
\begin{cases}
1 & \text{if } (\delta_i, \gamma_i) = (1, 1), \\
0 & \text{if } (\delta_i, \gamma_i) = (0, 1), \\
u & \text{if } (\delta_i, \gamma_i) = (0, 0),
\end{cases}
\]

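and, for

\[ j = (j_1, j_2, \ldots, j_n), \]

\[ w_j(a|b) = w_{j_1}(a_1|b_1) \cdots w_{j_n}(a_n|b_n), \]

we have

\[
\alpha(a, b) = \bigvee_{j \in \{0,u,1\}^n} \alpha(\delta_1, \ldots, \delta_n, \gamma_1, \ldots, \gamma_n)\, w_j(a|b)
= \bigvee_{j \in J(\alpha)} w_j(a|b),
\]

where

\[ J(\alpha) \subseteq \{0, u, 1\}^n, \qquad
J(\alpha) = \{ j : \alpha(\delta_1, \ldots, \delta_n, \gamma_1, \ldots, \gamma_n) = 1 \}. \]

Note that j = (j_1, ..., j_n), with j_i corresponding to (δ_i, γ_i). Define

\[ \Psi_f : \{0, u, 1\}^n \to \{0, u, 1\} \]

by

\[
\Psi_f(j) =
\begin{cases}
1 & \text{if } j \in J(\alpha), \\
0 & \text{if } j \in J^c(\alpha) \cap J(\beta), \\
u & \text{if } j \in J^c(\beta),
\end{cases}
\]

where J^c(α) denotes the set-complement of J(α) in {0, u, 1}^n, and similar notation
applies to J(β).

Note that f(a|b) might have another representation form, say (λ(a, b) | β(a, b)), but
then α(a, b) ∩ β(a, b) = λ(a, b) ∩ β(a, b), implying that

\[ J(\alpha) \cap J(\beta) = J(\lambda) \cap J(\beta), \]

so that Ψ_f is well-defined.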

For Ψ_f defined above, (*) holds. Indeed, for (a|b) ∈ (R|R)^n arbitrary but fixed,
φ(f(a|b))(ω) = 1 if and only if φ(α(a, b) | β(a, b))(ω) = 1 if and only if ω ∈ α(a, b)
(assuming α ≤ β) if and only if

\[ \omega \in \bigvee_{j \in J(\alpha)} w_j(a|b) \]

if and only if

\[ \omega \in w_j(a|b) \quad \text{for some } j \in J(\alpha) \]

if and only if

\[ \omega \in w_{j_i}(a_i|b_i), \quad \forall i = 1, 2, \ldots, n \]

(where j = (j_1, ..., j_n)) if and only if

\[ \varphi(a_i|b_i)(\omega) = j_i, \quad \forall i = 1, 2, \ldots, n \]

if and only if

\[ 1 = \Psi_f(j) = \Psi_f(\varphi(a_1|b_1)(\omega), \ldots, \varphi(a_n|b_n)(\omega)) = \Psi_f(\varphi^n(a|b)(\omega)). \]

The argument is similar for φ(f(a|b))(ω) = 0 or u.

Conversely, if Ψ : {0, u, 1}^n → {0, u, 1} is given, then there exists a unique
Boolean-like map (α|β) : (R|R)^n → R|R such that (*) holds. Indeed, it suffices to take

\[ \alpha(a, b) = \bigvee_{j \in \Psi^{-1}(1)} w_j(a|b), \]

\[ \beta(a, b) = \bigvee_{j \in \Psi^{-1}(1) \cup \Psi^{-1}(0)} w_j(a|b). \]

Several remarks are in order.

(i) Viewing (a|b) as a mathematical entity with the three possible values 0, u, or 1,
the function Ψ_f uniquely associated with a map f : (R|R)^n → R|R is precisely the "truth
table" of f. The function Ψ_f is completely determined once f is specified. The
converse is also true: a truth table Ψ will uniquely determine a "syntactic" (mathematical)
modeling of a connective on R|R. Moreover, (*) of Theorem 2 expresses the
truth-functional property of logic, namely truth values of an n-ary connective on R|R
are determined from those of the components.

(ii) In the literature of three-valued logic (for example, Rescher, 1969), one usually
considers a collection of sentences S in which each sentence s can be either true, false,
or "undetermined" (Lukasiewicz, Bochvar, Kleene). The algebraic structure of S is rarely
specified. Instead, semantically, five truth tables, one each for ∧, ∨, ′, →, and ↔, are
given. Our remarks above show that, given such a system of "truth tables", one can
explicitly write down their "syntactic" counterparts, and conversely.

It is interesting to speculate about the algebraic analog of a Boolean ring as a basic
space for Lukasiewicz's logic. That is, can one give a mathematical representation of a
sentence s in S in such a way that, as an algebraic structure, S will be equipped with the
basic connectives whose truth tables are given in advance? As we shall see in Section 3.5,
one such mathematical representation for S is our conditional extension R|R, where our
logical operations introduced in Section 2.2 correspond precisely to Lukasiewicz's truth
tables.

(iii) As far as we are concerned here, the easy part of Theorem 2 will serve as a
way to discuss the "reasonability" of our proposed system of logical operations for
conditional events. This will be carried out in two steps. First, from a given system of
operations on R|R, one proceeds to identify their associated truth tables using normal
disjunctive forms of Boolean functions and the explicit construction of Ψ_f given in the
proof of Theorem 1. Next, once a system of truth tables is obtained, one looks at the
names of the connectives involved (say, f = "and") and examines their truth tables. Since
a truth table of a given connective (in natural language) should reflect the common sense
meaning of that connective, any "unreasonable" truth table found will lead to the
conclusion that its corresponding proposed operation on R|R is "unreasonable". This
program will be carried out in Section 3.5 with the systems of logical operations on R|R
proposed by Adams, Calabrese, Schay, and by us.

The other part of Theorem 2, namely that each truth table in three-valued logic
corresponds uniquely to an operator on R|R, is useful for investigating new algebraic
structure of R|R.

(iv) The above three-valued logic viewpoint can be used to formulate the concept of
realizations of conditional events. Let R be a σ-field of subsets of a sample space Ω. The
generalized indicator function φ of each (a|b) is defined as φ(a|b)(ω) = 1 on ab, u on b′,
and 0 on a′b. As in the case of ordinary events, where a ∈ R is said to be "realized" if the
"outcome" ω ∈ a, that is, if φ(a|1)(ω) = 1, conditional events can possess a similar
concept, viewed from a three-valued logic standpoint. Recall that (a|b) = [ab, b′ ∨ a]. If
ω ∈ ab, then ω ∈ x for all x ∈ (a|b), so that (a|b) is "fully" realized; if ω ∈ a′b, then ω ∉ x
for any x ∈ (a|b), since a′bx = 0, thus (a|b) is realized at "level" 0; if ω ∈ b′, then for
each x ∈ (a|b), x may or may not occur, depending on whether ω ∈ xb′ or not. If it is,
then we can interpret the realization at some level, for example at level P(a|b) for some
probability measure P on R. This can be justified by the consideration of a random
variable X defined on Ω having values 0 on a′b, 1 on ab, and P(a|b) on b′, and noting
that E(X) = P(a|b).
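The identity E(X) = P(a|b) follows from E(X) = P(ab) + P(a|b)P(b′) = P(a|b)P(b) + P(a|b)(1 − P(b)). The following Python sketch checks it on a small finite sample space. The sample space (a fair die), the events a and b, and the helper names `indicator` and `expected_level` are illustrative assumptions, not taken from the text; P(b) > 0 is assumed.

```python
from fractions import Fraction

def indicator(a, b, w):
    """Generalized indicator of (a|b) at outcome w: 1 on ab, 0 on a'b, 'u' on b'."""
    if w in b:
        return 1 if w in a else 0
    return "u"

def expected_level(a, b, P):
    """Return (E(X), P(a|b)) for X = 1 on ab, 0 on a'b, P(a|b) on b'.

    P maps each outcome to its probability; P(b) > 0 is assumed.
    """
    p_b = sum(P[w] for w in b)
    cond = sum(P[w] for w in a & b) / p_b
    level = {1: Fraction(1), 0: Fraction(0), "u": cond}
    return sum(P[w] * level[indicator(a, b, w)] for w in P), cond

# Hypothetical example: a fair six-sided die.
P = {w: Fraction(1, 6) for w in range(1, 7)}
a, b = {1, 2, 3}, {2, 3, 4, 5}
ex, cond = expected_level(a, b, P)
print(ex, cond)
```

Exact rational arithmetic (`Fraction`) is used so that the two quantities can be compared for equality without rounding concerns.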

(v) The viewpoint of three-valued logic taken here should not be confused with the
three-valued logic associated with "conditional forms" of McCarthy (1967), which
motivated algebraic investigations referred to in the literature as "conditional logic"
(Guzman and Squier, 1990). "Conditional logic" in the literature sometimes refers to the
non-commutative (regular) extension of Boolean logic to three truth values, the third
denoted u and standing for "undefined" or "non-terminating evaluation". The
non-commutativity refers to the logic connectives ∨ and ∧ in the extended logic. This
phenomenon appears in McCarthy (1967), in which it was shown that in order to define
computable partial functions, it is necessary to allow undefined expressions in the
recursive formulae. From a logical viewpoint, this amounts to considering a third truth
value "u" for these undefined expressions.

Consider, for example, defining recursively the function f(n) = n! on the domain of
non-negative integers. A verbal rule is "if n = 0, then assign the value 1, else assign the
value n(n − 1)!" The statement "If ..., then ..., else if ... then ..." is called a "conditional
expression". In symbols, a conditional expression is denoted

\[ (a_1 \to b_1, a_2 \to b_2, \ldots, a_n \to b_n) = CE(a_1, \ldots, a_n; b_1, \ldots, b_n), \]

which means "if a_1 then b_1, else if a_2 then b_2, ..., else if a_n then b_n." Its value is
defined as CE(a_1, ..., a_n; b_1, ..., b_n) = b_j, where j is the first i such that a_i is true.

The evaluation of CE(a_1, ..., a_n; b_1, ..., b_n) proceeds from left to right, and stops
when the first true a_i is found. Of course, the a_i and b_i are propositions, that is, can only
be true (T) or false (F). That is, we are in classical two-valued logic, where the
propositions are elements of a Boolean ring R with the usual connectives. If T and F also
stand for "always true" and "always false", respectively, then the usual Boolean
connectives can be expressed in terms of some simple conditional expressions. Indeed,
using truth tables in two-valued logic, it is readily checked that

\[
\begin{aligned}
a \wedge b &= (a \to b,\; T \to F), \\
a \vee b &= (a \to T,\; T \to b), \\
a' &= (a \to F,\; T \to T), \\
a' \vee b &= (a \to b,\; T \to T).
\end{aligned}
\]

Now consider the partial function f(n) = n! on the set of integers. The recursive
definition of f(n) in terms of conditional expressions is

\[ n! = (n = 0 \to 1,\; n \ne 0 \to n(n-1)!). \]

Thus

\[
\begin{aligned}
2! &= (2 = 0 \to 1,\; 2 \ne 0 \to 2(2-1)!) \\
   &= 2(1!) = 2\,(1 = 0 \to 1,\; 1 \ne 0 \to 1(1-1)!) \\
   &= 2(1)\,(0 = 0 \to 1,\; 0 \ne 0 \to 0(0-1)!) = 2(1)(1) = 2.
\end{aligned}
\]
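The left-to-right evaluation rule can be mimicked in Python by passing each a_i and b_i as an unevaluated (zero-argument) function, so that only the selected value is ever computed. This is a minimal sketch under that assumption; the helper `ce` and its calling convention are hypothetical and are not McCarthy's notation.

```python
def ce(*clauses):
    """Evaluate (a1 -> b1, ..., an -> bn); each clause is a pair of zero-argument callables."""
    for condition, value in clauses:
        if condition():          # conditions are tested from left to right
            return value()       # only the selected value is ever computed
    raise ValueError("no condition holds: the expression is undefined")

def factorial(n):
    # n! = (n = 0 -> 1, n != 0 -> n(n - 1)!)
    return ce((lambda: n == 0, lambda: 1),
              (lambda: n != 0, lambda: n * factorial(n - 1)))

print(factorial(2))
```

Because the second clause is never evaluated when n = 0, the term involving (0 − 1)! is never computed, which is why the recursion terminates for non-negative n.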


Note that (0 − 1)! is undefined. To carry out the computation above, it is necessary to
allow the conditional expression to be defined even if the term beyond the one that gives
the value is undefined. Thus in a general CE(a_1, ..., a_n; b_1, ..., b_n), one should allow
the situation where a_i or b_j are undefined, which means that the range of truth values of
each "proposition" is extended to {T, u, F}. In this logic, the CE are defined as follows:

\[ CE(a_1, \ldots, a_n; b_1, \ldots, b_n) = b_j \]

if there is a b_j which is "defined" such that a_j is true and a_i is false for i < j, and is
undefined otherwise. Thus CE(a_1, ..., a_n; b_1, ..., b_n) is undefined, that is, has truth
value "u", when either

(i) all the a_i are false, or

(ii) a_i is false for i < j, a_j is true, and b_j is undefined, or

(iii) there is an undefined a_i before a true a_j.

From this, it becomes clear that the extended connectives ∨ and ∧ are not commutative.
Indeed, from a ∧ b = (a → b, T → F), if a is F and b is u, then the value of a ∧ b is F,
which has truth value F, while b ∧ a has value undefined (by (iii) above) with truth value
u. Similarly, if a is T and b is u, then a ∨ b is T, but b ∨ a is u.
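The three-valued rule for conditional expressions, and the resulting non-commutativity, can be checked with a short Python sketch. The encoding of truth values as the strings "T", "F", "u" and the function names `ce`, `conj`, `disj`, `neg` are assumptions made for illustration only.

```python
T, F, U = "T", "F", "u"

def ce(*clauses):
    """Three-valued value of (a1 -> b1, ..., an -> bn), scanning left to right."""
    for a, b in clauses:
        if a == U:
            return U      # case (iii): an undefined condition is met before a true one
        if a == T:
            return b      # first true condition; b may itself be u, which is case (ii)
    return U              # case (i): all conditions are false

def conj(a, b):
    return ce((a, b), (T, F))     # a AND b = (a -> b, T -> F)

def disj(a, b):
    return ce((a, T), (T, b))     # a OR b = (a -> T, T -> b)

def neg(a):
    return ce((a, F), (T, T))     # NOT a = (a -> F, T -> T)

print(conj(F, U), conj(U, F))     # the two orders give different values
print(disj(T, U), disj(U, T))
```

Restricted to T and F, these functions reduce to the classical connectives; the asymmetry appears only when one argument is u.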

This  non-commutative  three-valued  logic  in  mechanical  computation  theory,  bearing 
the  name  of  "conditional  logic"  because  of  the  role  played  by  conditional  forms  in 
recursive  computations,  seems  not  to  be  in  the  mainstream  of  multi-valued  logic. 


3.5 Comparison of various systems of logical operators.

In this section, we are going to examine three systems (′, ∧, ∨) of basic connectives
on R|R. The truth tables of these systems will be constructed, and the relative merits of
these systems will be discussed. These systems have been chosen because they have been
studied to some extent as algebraic systems. Indeed, the last one (Goodman and
Nguyen's) is elaborated on at length in Chapter 4. Some connectives on R|R arising from
the truth tables of several three-valued logics will be constructed. These particular
three-valued logics are chosen because of their particular interest and importance in the
field. Our principal tool is Theorem 2 of Section 3.4 and the comments following it with
regard to making the necessary constructions.

We begin with the definitions and a bit of discussion of the three systems (′, ∧, ∨) of
connectives to be examined. The recent paper by Dubois and Prade (1990) is relevant
here. All the systems have the same negation ′ on R|R, given by (a|b)′ = (a′b|b).
This is in agreement with the negation operator in the Boolean ring R/Rb′.

In his 1968 paper, Schay investigated the two systems which follow.

Schay's First System

\[ (a|b) \wedge (c|d) = ((b' \vee a)(d' \vee c) \,|\, b \vee d), \]

\[ (a|b) \vee (c|d) = (ab \vee cd \,|\, b \vee d). \]

In his original formulation of this system, Schay (1968, p. 338) wrote the operations
slightly differently. Conjunction was given as

\[ (a|b) \wedge (c|d) = ((abcd \vee abd' \vee cdb') \,|\, b \vee d). \]

But

\[ (b' \vee a)(d' \vee c) = b'd' \vee b'c \vee ad' \vee ac, \]

and

\[
\begin{aligned}
(b'd' \vee b'c \vee ad' \vee ac) \wedge (b \vee d) &= b'cd \vee ad'b \vee acb \vee acd \\
&= abcd \vee abd' \vee cdb'.
\end{aligned}
\]

Disjunction was given by

\[ (a|b) \vee (c|d) = ((a \vee c)bd \vee abd' \vee cdb' \,|\, b \vee d), \]

and

\[
\begin{aligned}
((a \vee c)bd \vee abd' \vee cdb') \wedge (b \vee d) &= abd \vee bcd \vee abd' \vee cdb' \\
&= ab \vee cd.
\end{aligned}
\]

Thus the two formulations are the same. After Schay, Adams (1975) and Calabrese
(1987) proposed this same system.
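The agreement of the two formulations can also be confirmed by brute force over all truth assignments of a, b, c, d. The sketch below is an illustrative check rather than part of the original argument; the function name `formulations_agree` is hypothetical. Two conditional events with the same antecedent b ∨ d are equal when their consequents agree after intersection with b ∨ d, so that is what is compared.

```python
from itertools import product

def formulations_agree():
    """Compare both formulations of Schay's first system on all 16 assignments."""
    for a, b, c, d in product((False, True), repeat=4):
        support = b or d
        # Conjunction: (b' v a)(d' v c) restricted to b v d, versus abcd v abd' v cdb'.
        conj_short = ((not b) or a) and ((not d) or c) and support
        conj_schay = (a and b and c and d) or (a and b and not d) or (c and d and not b)
        # Disjunction: ab v cd restricted to b v d, versus (a v c)bd v abd' v cdb'.
        disj_short = ((a and b) or (c and d)) and support
        disj_schay = ((a or c) and b and d) or (a and b and not d) or (c and d and not b)
        if conj_short != conj_schay or disj_short != disj_schay:
            return False
    return True

print(formulations_agree())
```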


Schay's Second System

\[ (a|b) \wedge (c|d) = (ac \,|\, bd), \]

\[ (a|b) \vee (c|d) = (a \vee c \,|\, bd). \]

In their work on the foundation of the Bayesian approach to statistics, Bruno and Gilio
(1985) considered connectives on conditional events corresponding to disjunction of
Schay's first system and to conjunction of Schay's second system.

Goodman and Nguyen's System

\[ (a|b) \wedge (c|d) = (ac \,|\, (a'b \vee c'd \vee bd)), \]

\[ (a|b) \vee (c|d) = (a \vee c \,|\, (ab \vee cd \vee bd)). \]

This system arose from Goodman and Nguyen's efforts (1988) to extend operations on R
(events) to those on R|R (conditional events) which would be consistent with conditional
probability, and resulted from realizing the elements of R|R as cosets of principal ideals
of R. These operations are set forth in Section 3.2.

Below is a table of these three systems of connectives in terms of conditional
Boolean polynomials. Following this table is a table of several well-known three-valued
logical systems. We make a slight change of notation. For n = 2, the Boolean
polynomials involved are functions of four variables, and it is convenient to denote those
variables A, C, B, and D, rather than X_1, X_2, X_3, and X_4. We remind the reader that for
n = 2, the evaluation of a conditional Boolean polynomial is given by

\[ f|g((a|b), (c|d)) = (f(ab, cd, b, d) \,|\, g(ab, cd, b, d)). \]

Thus if

\[ f|g = (AD' \vee CB' \vee AC \,|\, B \vee D) \]

then

\[ f|g((a|b), (c|d)) = ((abd' \vee cdb' \vee abcd) \,|\, b \vee d). \]

  System             ∧                              ∨

  Schay's first      AD′ ∨ CB′ ∨ AC | B ∨ D         A ∨ C | B ∨ D
  Schay's second     AC | BD                        A ∨ C | BD
  Goodman-Nguyen     AC | A′B ∨ C′D ∨ BD            A ∨ C | A ∨ C ∨ BD

The conditional Boolean polynomials above for ∨ and ∧ just reflect the formulas
for (a|b) ∨ (c|d) and (a|b) ∧ (c|d). The polynomials are, of course, not in disjunctive
normal form, but are in much simpler forms. We have not written the relevant
polynomials for ′ since they are all A′B|B, using A and B for the two variables.

In truth tables below, 0, 1, and u are used for (0|1), (1|1), and (0|0), respectively,
and x and y for (a|b) and (c|d), respectively.

[Tables: truth tables for x′, x ∧ y, x ∨ y, x → y, and x ↔ y in Bochvar's, Heyting's,
Kleene's, Lukasiewicz's, and Sobocinski's three-valued logics.]

We will now compute the truth tables for Schay's first system. For all three systems,
we have for ′

\[
\begin{aligned}
(0|1)' &= (0' \cdot 1 \,|\, 1) = (1|1), \\
(1|1)' &= (1' \cdot 1 \,|\, 1) = (0|1), \\
(0|0)' &= (0' \cdot 0 \,|\, 0) = (0|0).
\end{aligned}
\]

The disjunction A ∨ C | B ∨ D for that system is simple enough so that its table can be
written down easily. To get the table for ∧, we make the following calculations, which are
just an exercise in evaluating the conditional Boolean polynomial

\[ f|g = AD' \vee CB' \vee AC \,|\, B \vee D. \]

\[
\begin{aligned}
(0|1) \wedge (0|1) &= (0 \vee 0 \vee 0 \,|\, 1 \vee 1) = (0|1), \\
(0|1) \wedge (1|1) &= (0 \vee 0 \vee 0 \,|\, 1 \vee 1) = (0|1), \\
(0|1) \wedge (0|0) &= (0 \vee 0 \vee 0 \,|\, 1 \vee 0) = (0|1), \\
(1|1) \wedge (0|1) &= (0 \vee 0 \vee 0 \,|\, 1 \vee 1) = (0|1), \\
(1|1) \wedge (1|1) &= (0 \vee 0 \vee 1 \,|\, 1 \vee 1) = (1|1), \\
(1|1) \wedge (0|0) &= (1 \vee 0 \vee 0 \,|\, 1 \vee 0) = (1|1), \\
(0|0) \wedge (0|1) &= (0 \vee 0 \vee 0 \,|\, 0 \vee 1) = (0|1), \\
(0|0) \wedge (1|1) &= (0 \vee 1 \vee 0 \,|\, 0 \vee 1) = (1|1), \\
(0|0) \wedge (0|0) &= (0 \vee 0 \vee 0 \,|\, 0 \vee 0) = (0|0).
\end{aligned}
\]

Thus we get the following truth tables for Schay's first system.

                    x ∧ y                       x ∨ y
   x    x′      x\y   0|1  1|1  0|0         x\y   0|1  1|1  0|0
  0|1  1|1      0|1   0|1  0|1  0|1         0|1   0|1  1|1  0|1
  1|1  0|1      1|1   0|1  1|1  1|1         1|1   1|1  1|1  1|1
  0|0  0|0      0|0   0|1  1|1  0|0         0|0   0|1  1|1  0|0

These tables are recognized as those of Sobocinski's three-valued logic.


Next we construct the truth tables for Schay's second system. Since the operations
are particularly simple, namely (a|b) ∧ (c|d) = (ac|bd) and (a|b) ∨ (c|d) = (a ∨ c|bd),
these tables can be written down easily. Here are the tables.

                    x ∧ y                       x ∨ y
   x    x′      x\y   0|1  1|1  0|0         x\y   0|1  1|1  0|0
  0|1  1|1      0|1   0|1  0|1  0|0         0|1   0|1  1|1  0|0
  1|1  0|1      1|1   0|1  1|1  0|0         1|1   1|1  1|1  0|0
  0|0  0|0      0|0   0|0  0|0  0|0         0|0   0|0  0|0  0|0

The tables are recognized as those of Bochvar's three-valued logic.

Now to the Goodman-Nguyen system. For the connective ∨ for this system, we
make the following calculations using the formula for ∨ for that system.

\[
\begin{aligned}
(0|1) \vee (0|1) &= (0 \vee 0 \,|\, 0 \vee 0 \vee (1 \wedge 1)) = (0|1), \\
(0|1) \vee (1|1) &= (0 \vee 1 \,|\, 0 \vee 1 \vee (1 \wedge 1)) = (1|1), \\
(0|1) \vee (0|0) &= (0 \vee 0 \,|\, 0 \vee 0 \vee (1 \wedge 0)) = (0|0), \\
(1|1) \vee (0|1) &= (1 \vee 0 \,|\, 1 \vee 0 \vee (1 \wedge 1)) = (1|1), \\
(1|1) \vee (1|1) &= (1 \vee 1 \,|\, 1 \vee 1 \vee (1 \wedge 1)) = (1|1), \\
(1|1) \vee (0|0) &= (1 \vee 0 \,|\, 1 \vee 0 \vee (1 \wedge 0)) = (1|1), \\
(0|0) \vee (0|1) &= (0 \vee 0 \,|\, 0 \vee 0 \vee (0 \wedge 1)) = (0|0), \\
(0|0) \vee (1|1) &= (0 \vee 1 \,|\, 0 \vee 1 \vee (0 \wedge 1)) = (1|1), \\
(0|0) \vee (0|0) &= (0 \vee 0 \,|\, 0 \vee 0 \vee (0 \wedge 0)) = (0|0).
\end{aligned}
\]

Making the analogous calculations for ∧, and putting the results in the usual form for truth
tables yields

                    x ∧ y                       x ∨ y
   x    x′      x\y   0|1  1|1  0|0         x\y   0|1  1|1  0|0
  0|1  1|1      0|1   0|1  0|1  0|1         0|1   0|1  1|1  0|0
  1|1  0|1      1|1   0|1  1|1  0|0         1|1   1|1  1|1  1|1
  0|0  0|0      0|0   0|1  0|0  0|0         0|0   0|0  1|1  0|0

These are recognized as truth tables for ′, ∧, and ∨, respectively, for Lukasiewicz's and
Kleene's three-valued logics. These three-valued logics are well established, and serve as a
strong motivation and justification for the Goodman-Nguyen operations ′, ∧, and ∨ on
R|R. Further, the tables for ∧ and ∨ are the truth tables for ∧ and ∨ for Heyting's
three-valued logic. Thus, once R|R is at hand, there are strong reasons from a
three-valued logical perspective to define the operations ∨, ∧, and ′ on R|R as done by
Goodman and Nguyen and for making a thorough study of the resulting algebraic system.
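The three sets of truth tables can be regenerated mechanically by evaluating each conditional Boolean polynomial at the nine pairs of truth values. The Python sketch below does this and compares the results with the usual descriptions of the named logics: Kleene's ∧ and ∨ as minimum and maximum under the order 0 < u < 1, Bochvar's connectives with u "infectious", and Sobocinski's with u neutral. These descriptions, the encoding `ENC`, and all function names are assumptions introduced for illustration.

```python
ENC = {0: (0, 1), 1: (1, 1), "u": (0, 0)}   # truth value -> values of (ab, b)

def decode(num, den):
    """Truth value of (num|den) for 0/1 arguments."""
    if not den:
        return "u"
    return 1 if num else 0

# Each connective is a pair (f, g) of polynomials in the variables A, C, B, D.
SYSTEMS = {
    "Schay 1": {
        "and": lambda A, C, B, D: ((A and not D) or (C and not B) or (A and C), B or D),
        "or":  lambda A, C, B, D: (A or C, B or D),
    },
    "Schay 2": {
        "and": lambda A, C, B, D: (A and C, B and D),
        "or":  lambda A, C, B, D: (A or C, B and D),
    },
    "Goodman-Nguyen": {
        "and": lambda A, C, B, D: (A and C, (not A and B) or (not C and D) or (B and D)),
        "or":  lambda A, C, B, D: (A or C, A or C or (B and D)),
    },
}

def truth_table(op):
    """Evaluate f|g((a|b), (c|d)) = (f(ab, cd, b, d) | g(ab, cd, b, d)) on truth values."""
    table = {}
    for x, (A, B) in ENC.items():
        for y, (C, D) in ENC.items():
            table[(x, y)] = decode(*op(A, C, B, D))
    return table

RANK = {0: 0, "u": 1, 1: 2}

def kleene_and(x, y):
    return min(x, y, key=RANK.get)

def kleene_or(x, y):
    return max(x, y, key=RANK.get)

def bochvar(op):
    """u is infectious: any u argument gives u."""
    return lambda x, y: "u" if "u" in (x, y) else op(x, y)

def sobocinski(op):
    """u is neutral: it is ignored unless both arguments are u."""
    return lambda x, y: y if x == "u" else x if y == "u" else op(x, y)

for name, ops in SYSTEMS.items():
    print(name, truth_table(ops["and"]))
```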

To illustrate the method of constructing a conditional Boolean operator on R|R from
a truth table, we construct that operator for Lukasiewicz's → and for Sobocinski's
disjunction. Following are those truth tables in terms of the elements of V|V, from which
it is easy to make the necessary calculations.

                x → y                       x ∨ y
  x\y   0|1  1|1  0|0         x\y   0|1  1|1  0|0
  0|1   1|1  1|1  1|1         0|1   0|1  1|1  0|1
  1|1   0|1  1|1  0|0         1|1   1|1  1|1  1|1
  0|0   0|0  1|1  1|1         0|0   0|1  1|1  0|0

The conditional Boolean polynomial f|g for → is determined by the following values for
f and g, which are read off from the table for →.

  f(0,0,1,1) = 1,    g(0,0,1,1) = 1,
  f(0,1,1,1) = 1,    g(0,1,1,1) = 1,
  f(0,0,1,0) = 1,    g(0,0,1,0) = 1,
  f(1,0,1,1) = 0,    g(1,0,1,1) = 1,
  f(1,1,1,1) = 1,    g(1,1,1,1) = 1,
  f(1,0,1,0) = 0,    g(1,0,1,0) = 0,
  f(0,0,0,1) = 0,    g(0,0,0,1) = 0,
  f(0,1,0,1) = 1,    g(0,1,0,1) = 1,
  f(0,0,0,0) = 1,    g(0,0,0,0) = 1.

Thus f and g are the Boolean polynomials

\[
\begin{aligned}
f &= A'C'BD \vee A'CBD \vee A'C'BD' \vee ACBD \vee A'CB'D \vee A'C'B'D' \\
  &= AC \vee A'B \vee B'C \vee B'D'
\end{aligned}
\]

and

\[
\begin{aligned}
g &= A'C'BD \vee A'CBD \vee A'C'BD' \vee AC'BD \vee ACBD \vee A'CB'D \vee A'C'B'D' \\
  &= A'C'BD \vee A'CB \vee A'BD' \vee AC'D \vee AC \vee B'C \vee B'D' \\
  &= A'B(C'D \vee C \vee D') \vee A(C'D \vee C) \vee B'C \vee B'D' \\
  &= A'B \vee AD \vee B'C \vee B'D' \\
  &= C \vee A'B \vee AD \vee B'D'.
\end{aligned}
\]

Thus Lukasiewicz's → is given by the formula

\[ (a|b) \to (c|d) = (ac \vee a'b \vee b'c \vee b'd' \,|\, a'b \vee ad \vee b'c \vee b'd'). \]

Similarly, the f and g for Sobocinski's disjunction have the values

  f(0,0,1,1) = 0,    g(0,0,1,1) = 1,
  f(0,1,1,1) = 1,    g(0,1,1,1) = 1,
  f(0,0,1,0) = 0,    g(0,0,1,0) = 1,
  f(1,0,1,1) = 1,    g(1,0,1,1) = 1,
  f(1,1,1,1) = 1,    g(1,1,1,1) = 1,
  f(1,0,1,0) = 1,    g(1,0,1,0) = 1,
  f(0,0,0,1) = 0,    g(0,0,0,1) = 1,
  f(0,1,0,1) = 1,    g(0,1,0,1) = 1,
  f(0,0,0,0) = 0,    g(0,0,0,0) = 0.

Thus f and g are the polynomials

\[
\begin{aligned}
f &= A'BCD \vee ABC'D \vee ABCD \vee ABC'D' \vee A'B'CD \\
  &= A'BC \vee AC'D \vee AB \vee AD' \vee B'C \\
  &= A(C'D \vee B \vee D') \vee C(A'B \vee B') \\
  &= A \vee C
\end{aligned}
\]

and

\[ g = (A'B'C'D')' = A \vee B \vee C \vee D = B \vee D. \]

Thus the formula for Sobocinski's disjunction is

\[ (a|b) \vee (c|d) = (a \vee c \,|\, b \vee d), \]

which, of course, we already knew.

We  now  illustrate  the  use  of  the  second  proof  of  Theorem  2  in  constructing  truth 
tables  for  various  three-valued  logical  operators. 

(i) Negation operators. For the negation operator given by

    (a|b)' = (a'|b) = (a'b|b),

the partition induced by (a|b) is

    w_0(a|b) = a'b,  w_1(a|b) = ab,  w_u(a|b) = b'.

Thus

    (a|b)' = (α(a,b) | β(a,b))

where

    α(a,b) = a'b

and

    β(a,b) = b = ab ∨ a'b.

Thus

    J(α) = {0},  J(β) = {0, 1},

and hence

    for i ∈ J(α) ∩ J(β) = {0},  ψ(0) = 1,
    for i ∈ J^c(β) = {u},  ψ(u) = u,  and
    for i ∈ J(β) ∩ J^c(α) = {1},  ψ(1) = 0.

This is Lukasiewicz's truth table for negation (see Section 3.4).


(ii) Conjunction operators. The conjunction operator ∧ of Schay's first system is
given by

    (a|b) ∧ (c|d) = ((b' ∨ a)(d' ∨ c) | b ∨ d)
                  = ((b' ∨ a)(d' ∨ c)(b ∨ d) | b ∨ d)
                  = (abd' ∨ cdb' ∨ abcd | b ∨ d).

Thus α_∧, or α for short, is

    w_1(a|b)w_u(c|d) ∨ w_1(c|d)w_u(a|b) ∨ w_1(a|b)w_1(c|d),

so that

    J(α) = {(1,u), (u,1), (1,1)}.

Next,

    β = b ∨ d = bd ∨ b'd ∨ bd'
      = (abcd) ∨ (abc'd) ∨ (a'bcd) ∨ (a'bc'd) ∨ (abd') ∨ (b'cd) ∨ (a'bd') ∨ (b'c'd).

Thus

    J(β) = {(1,1), (1,0), (0,1), (0,0), (1,u), (u,1), (0,u), (u,0)}

and

    J(α) ∩ J(β) = {(1,1), (u,1), (1,u)},
    J^c(β) = {(u,u)},
    J(β) ∩ J^c(α) = {(1,0), (0,1), (0,0), (0,u), (u,0)}.

Therefore

                1   for (i,j) ∈ {(1,1), (u,1), (1,u)}
    ψ_∧(i,j) =  u   for (i,j) = (u,u)
                0   for (i,j) ∈ {(1,0), (0,1), (0,0), (0,u), (u,0)}.

This is Sobocinski's truth table for conjunction.

For Schay's conjunction in his second system,

    (a|b) ∧ (c|d) = (ac|bd),

and one obtains

                1   for (i,j) = (1,1)
    ψ_∧(i,j) =  u   for (i,j) ∈ {(0,u), (u,0), (u,u), (u,1), (1,u)}
                0   for (i,j) ∈ {(0,0), (0,1), (1,0)}.

This is Bochvar's truth function for conjunction.
For the Goodman-Nguyen conjunction,

    (a|b) ∧ (c|d) = (ac | a'b ∨ c'd ∨ bd) = (abcd | a'b ∨ c'd ∨ bd),

whence α = abcd and so

    J(α) = {(1,1)}.

    β = a'b ∨ c'd ∨ bd
      = a'b ∨ c'd ∨ (bdac ∨ bd(ac)')
      = (abcd) ∨ bda' ∨ a'b ∨ c'd ∨ bdc'
      = (abcd) ∨ bda' ∨ (a'bd ∨ a'bd') ∨ (c'db ∨ c'db') ∨ bdc'
      = (abcd) ∨ bda' ∨ a'bd' ∨ c'db ∨ c'db'
      = (abcd) ∨ ((a'b)c'd) ∨ ((a'b)cd) ∨ (a'bd') ∨ (abc'd) ∨ (a'bc'd) ∨ c'db',

so that

    J(β) = {(1,1), (0,0), (0,1), (0,u), (1,0), (u,0)}.

Hence

    J(α) ∩ J(β) = {(1,1)},
    J^c(β) = {(u,u), (u,1), (1,u)},

and

    J(β) ∩ J^c(α) = {(0,0), (0,1), (0,u), (1,0), (u,0)},

so that

                1   for (i,j) = (1,1)
    ψ_∧(i,j) =  u   for (i,j) ∈ {(u,u), (u,1), (1,u)}
                0   for (i,j) ∈ {(0,0), (0,1), (0,u), (1,0), (u,0)}

which is Lukasiewicz's truth table for conjunction (see 3.4).
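The three conjunction tables above can be regenerated by evaluating each syntactic formula at 0/1 values of a, b, c, d. The sketch below is illustrative only: the function names are hypothetical, the representatives 0|1, 1|1, 0|0 are assumed for 0, 1, u, and the three reference tables are written directly from their usual descriptions (u ignored, u contagious, minimum) rather than taken from the text.

```python
from itertools import product

U = "u"
VALS = (0, U, 1)
REPS = {0: (0, 1), 1: (1, 1), U: (0, 0)}   # (a, b) for 0|1, 1|1, 0|0
RANK = {0: 0, U: 1, 1: 2}                   # the order 0 < u < 1

def value(alpha, beta):
    """Truth value of (alpha|beta): u when beta is false, else alpha."""
    return U if beta == 0 else alpha & beta

def schay1_and(a, b, c, d):      # ((b' v a)(d' v c) | b v d)
    return ((1 - b) | a) & ((1 - d) | c), b | d

def schay2_and(a, b, c, d):      # (ac | bd)
    return a & c, b & d

def gn_and(a, b, c, d):          # (ac | a'b v c'd v bd)
    return a & c, ((1 - a) & b) | ((1 - c) & d) | (b & d)

def truth_table(op):
    return {(x, y): value(*op(*REPS[x], *REPS[y]))
            for x, y in product(VALS, repeat=2)}

# Reference three-valued conjunctions, written independently of the formulas.
def sobocinski(x, y):            # u is ignored unless both arguments are u
    return y if x == U else x if y == U else x & y

def bochvar(x, y):               # u is contagious
    return U if U in (x, y) else x & y

def lukasiewicz(x, y):           # minimum in the order 0 < u < 1
    return min(x, y, key=RANK.get)

def reference(fn):
    return {(x, y): fn(x, y) for x, y in product(VALS, repeat=2)}

for name, op, ref in [("Schay I", schay1_and, sobocinski),
                      ("Schay II", schay2_and, bochvar),
                      ("Goodman-Nguyen", gn_and, lukasiewicz)]:
    print(name, truth_table(op) == reference(ref))
```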


(iii) Disjunction operators. Adams's and Calabrese's disjunctions are identical to the
disjunction in Schay's first system, which is given by

    (a|b) ∨ (c|d) = (ab ∨ cd | b ∨ d).

We have

                1   for (i,j) ∈ {(0,1), (u,1), (1,0), (1,u), (1,1)}
    ψ_∨(i,j) =  u   for (i,j) = (u,u)
                0   for (i,j) ∈ {(0,0), (0,u), (u,0)}

which is Sobocinski's disjunction.

The disjunction in Schay's second system is

    (a|b) ∨ (c|d) = (a ∨ c | bd),

                1   for (i,j) ∈ {(0,1), (1,0), (1,1)}
    ψ_∨(i,j) =  u   for (i,j) ∈ {(0,u), (u,0), (u,u), (u,1), (1,u)}
                0   for (i,j) = (0,0)

which is Bochvar's disjunction.

For the Goodman-Nguyen disjunction,

    (a|b) ∨ (c|d) = (a ∨ c | ab ∨ cd ∨ bd)
                  = (ab ∨ cd | ab ∨ cd ∨ bd),

so

    α = ab ∨ cd
      = (abd' ∨ abdc ∨ abdc') ∨ (cdb' ∨ cdba ∨ cdba')
      = (abcd ∨ abd' ∨ cdb' ∨ abc'd ∨ cdba').

Thus

    J(α) = {(1,1), (1,u), (u,1), (1,0), (0,1)},

and

    β = ab ∨ cd ∨ bd.

Note that

    bd = (abcd) ∨ a'bd ∨ bc'd
       = (abcd) ∨ (a'bdc ∨ a'bdc') ∨ (abc'd),

and thus

    J(β) = J(α) ∪ {(0,0)}.

Hence

                1   for (i,j) ∈ {(1,1), (1,u), (u,1), (1,0), (0,1)}
    ψ_∨(i,j) =  u   for (i,j) ∈ {(u,0), (u,u), (0,u)}
                0   for (i,j) = (0,0)

which is Lukasiewicz's truth table for disjunction.
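The same evaluation can be carried out for the three disjunctions. This is again a hedged sketch: the helper names are made up for illustration, the representatives 0|1, 1|1, 0|0 are assumed for 0, 1, u, and the reference tables are coded from their standard descriptions.

```python
from itertools import product

U = "u"
VALS = (0, U, 1)
REPS = {0: (0, 1), 1: (1, 1), U: (0, 0)}   # (a, b) for 0|1, 1|1, 0|0
RANK = {0: 0, U: 1, 1: 2}                   # the order 0 < u < 1

def value(alpha, beta):
    """Truth value of (alpha|beta): u when beta is false, else alpha."""
    return U if beta == 0 else alpha & beta

def schay1_or(a, b, c, d):       # (ab v cd | b v d)
    return (a & b) | (c & d), b | d

def schay2_or(a, b, c, d):       # (a v c | bd)
    return a | c, b & d

def gn_or(a, b, c, d):           # (a v c | ab v cd v bd)
    return a | c, (a & b) | (c & d) | (b & d)

def truth_table(op):
    return {(x, y): value(*op(*REPS[x], *REPS[y]))
            for x, y in product(VALS, repeat=2)}

def sobocinski_or(x, y):         # u is ignored unless both arguments are u
    return y if x == U else x if y == U else x | y

def bochvar_or(x, y):            # u is contagious
    return U if U in (x, y) else x | y

def lukasiewicz_or(x, y):        # maximum in the order 0 < u < 1
    return max(x, y, key=RANK.get)

def reference(fn):
    return {(x, y): fn(x, y) for x, y in product(VALS, repeat=2)}

results = {
    "Schay I": truth_table(schay1_or) == reference(sobocinski_or),
    "Schay II": truth_table(schay2_or) == reference(bochvar_or),
    "Goodman-Nguyen": truth_table(gn_or) == reference(lukasiewicz_or),
}
print(results)
```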

Each proposed system in three-valued logic has its own rationale. Since logical
operations on conditionals correspond to truth tables in three-valued logics, the
comparison of different algebras of conditional events is delicate. However, based on
Rescher's discussion (Rescher, 1969, pp. 131-133), we make some comparisons below. To
do that, we first complete the description of the three algebras, Schay's first and second
systems and the Goodman-Nguyen system, by writing down the syntax operations
corresponding to the remaining two truth tables, namely for implication (→) and for
equivalence (↔). (Of course x ↔ y means (x → y) ∧ (y → x).) Thus we will have three
algebras of conditional events, corresponding respectively to the three three-valued logics
of Sobocinski, Bochvar, and Lukasiewicz. In the following tables, the implication and
equivalence are expressed in terms of ', ∧, and ∨ within each system. As usual, we
generally denote ∧ by juxtaposition. Also, in R, the implication → is material
implication. Here are the tables.

Schay's First System

    (a|b) → (c|d) = (a|b)' ∨ (c|d)
    (a|b) ↔ (c|d) = ((a ↔ c)bd | b ∨ d)

Schay's Second System

    (a|b) → (c|d) = (a → c | bd)
    (a|b) ↔ (c|d) = ((a ↔ c) | bd)

Goodman and Nguyen's System

    (a|b) → (c|d) = (b'd'|1) ∨ (a|b)' ∨ (c|d)
    (a|b) ↔ (c|d) = ((a|b) → (c|d)) ∧ ((c|d) → (a|b))

In Heyting's three-valued logic, ∧ and ∨ are the same as Lukasiewicz's, so in the
corresponding algebra, ∧ and ∨ are the same as those of Goodman-Nguyen. Heyting's
negation is different, and is defined by

    (a|b)' = (a'b|1).

Goodman and Nguyen's ∧ and ∨ make R|R into a lattice, and on that lattice, Heyting's
negation turns out to be a pseudo-complementation, making R|R into a Stone algebra.
The details are in Chapter 4. The operations on R|R corresponding to Heyting's → and
↔ are

    (a|b) → (c|d) = b'd' ∨ a'b ∨ (c|d)
    (a|b) ↔ (c|d) = b'd' ∨ ((a ↔ c)bd | a'b ∨ c'd ∨ bd)

Now, examining the truth tables of the conjunction and disjunction operators in
Schay's first and second systems, we see that they all violate plausible conditions for
multi-valued logics. First, viewing u as lying between 0 and 1, any conjunction ∧
should be such that x ∧ y is the "falsest" of x and y. Likewise, any disjunction ∨
should yield the "truest" of x and y (Rescher, 1969, p. 133). Thus for Schay's first
system,

    u ∧ 1 and 1 ∧ u should be 0 or u, but not 1, and
    0 ∨ u and u ∨ 0 should be 1 or u, but not 0.

Likewise, for Schay's second system,

    u ∧ 0 and 0 ∧ u should be 0, but not u, and
    u ∨ 1 and 1 ∨ u should be 1, but not u.

Finally, one can simply require that each logical operator on R|R should satisfy a
list of reasonable properties. For example, let ∧ denote a binary operator on R|R
representing "conjunction." Then the following is such a list:

P1   (a|1) ∧ (c|1) = (a ∧ c | 1) (∧ extends conjunction of unconditional events),

P2   (a|b) ∧ (c|d) ≤ (a|b), (c|d),

P3   ∧ is associative,

P4   ∧ is commutative,

P5   ∧ is idempotent ((a|b) ∧ (a|b) = (a|b)),

P6   (a|b) ∧ 0 = 0,

P7   (a|b) ∧ 1 = (a|b),

P8   (a|b) ∧ (c|b) = (ac|b),

P9   (a|b) ∧ b = ab (modus ponens),

P10  (a|bc) ∧ (b|c) = (ab|c) (a chaining property).

All of Schay's, Adams's, and Calabrese's conjunction operators fail to satisfy P2.
Their corresponding disjunction operators fail to satisfy the dual property
(a|b) ∨ (c|d) ≥ (a|b), (c|d).

In terms of truth tables, ∧ satisfies P2 if and only if i ∧ j ≤ min(i,j), for
i, j ∈ {0, u, 1}.

If ∧ satisfies P1, P2 and P4, then the corresponding truth table must be one of
the following four.


    ∧1   0  u  1          ∧2   0  u  1

    0    0  0  0          0    0  0  0
    u    0  0  0          u    0  0  u
    1    0  0  1          1    0  u  1


    ∧3   0  u  1          ∧4   0  u  1

    0    0  0  0          0    0  0  0
    u    0  u  0          u    0  u  u
    1    0  0  1          1    0  u  1

The table for ∧4 is Lukasiewicz's truth table for conjunction, which corresponds to
the Goodman-Nguyen conjunction operator. Using Theorem 2 of 3.4, we get the
operations

    (a|b) ∧1 (c|d) = abcd,

    (a|b) ∧2 (c|d) = (abcd | abcd ∨ a'b ∨ c'd ∨ b'd'),

    (a|b) ∧3 (c|d) = (abcd | b ∨ d),

    (a|b) ∧4 (c|d) = (ac | a'b ∨ c'd ∨ bd).


Now it is easily checked that

    ∧1 does not satisfy P5, P7, P8 and P10;

    ∧2 does not satisfy P5, P8;

    ∧3 does not satisfy P7.

Only ∧4 satisfies all ten properties!
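The reduction to four candidate tables can be reproduced by brute force over all 3^9 binary operations on {0, u, 1}. The sketch below is an illustration under assumptions, not the authors' procedure: the predicate names are invented, and only P1, P2, P4, P5 and P7 are encoded, each in its truth-table form.

```python
from itertools import product

VALS = (0, "u", 1)
RANK = {0: 0, "u": 1, 1: 2}      # the order 0 < u < 1

def p1(t):   # agrees with Boolean conjunction on {0, 1}
    return all(t[i, j] == (i & j) for i in (0, 1) for j in (0, 1))

def p2(t):   # i ^ j <= min(i, j)
    return all(RANK[t[i, j]] <= min(RANK[i], RANK[j]) for i in VALS for j in VALS)

def p4(t):   # commutative
    return all(t[i, j] == t[j, i] for i in VALS for j in VALS)

def p5(t):   # idempotent
    return all(t[i, i] == i for i in VALS)

def p7(t):   # 1 acts as an identity
    return all(t[i, 1] == i for i in VALS)

cells = list(product(VALS, repeat=2))
candidates = []
for outputs in product(VALS, repeat=9):
    t = dict(zip(cells, outputs))
    if p1(t) and p2(t) and p4(t):
        candidates.append(t)

survivors = [t for t in candidates if p5(t) and p7(t)]
minimum = {(i, j): min(i, j, key=RANK.get) for i, j in cells}
print(len(candidates), len(survivors), survivors[0] == minimum)
```

Under these encodings the search leaves four tables, and adding idempotency and the identity law leaves only the minimum table, that is, the table ∧4.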

In  summary,  in  his  pioneering  work  on  logical  conditional  operators,  Schay  (1968) 
proposed,  at  the  syntax  level,  two  systems.  As  Dubois  and  Prade  (1989,  1990)  have 
pointed  out,  and  as  we  proved  in  Sections  3.4  and  3.5,  Schay's  systems  correspond 
precisely  to  two  well-known  three-valued  semantics,  namely  those  of  Sobocinski  and 
Bochvar.  The  algebraic  approach  to  logical  operations  on  conditionals  presented  in 
Section 3.2 leads to a syntactic system corresponding to Lukasiewicz's three-valued logic.
The comparisons above suggest that each choice of a logical system should be dictated by
the situation at hand. This is similar to the situation in fuzzy logic (see Chapter 7). In
particular, the choice between Lukasiewicz's and Sobocinski's logics is a matter of debate as
far as appropriate semantics for conditionals is concerned. See Chapter 6 for more details.
In this book we take the viewpoint of Lukasiewicz, and investigate the mathematics of
conditionals corresponding to his three-valued logic.

3.6  Connection  with  qualitative  probability 

Qualitative  (or  comparative  subjective,  or  objective  propensity)  probability  is 
motivated  by  the  desire  to  make  numerical  probability  measures  compatible  with 
non-numerical  probability  comparisons.  For  a  general  exposition,  see  Fine  (1973,  Chapter 
II). See also Fishburn (1983), Villegas (1967), Domotor (1969) and Suppes (1973) for
further  background. 

In general, qualitative probability is a kind of order relation ≺ on a given Boolean
ring R. For a, b ∈ R, the relation a ≺ b is interpreted as "b is at least as probable as
a." Then, for a ≺ b, one seeks probability measures P on R such that P(a) ≤ P(b).

More strongly, one attempts to determine a qualitative probability ≺ and
quantitative probability measures P on R such that a ≺ b if and only if P(a) ≤ P(b) for
all a, b ∈ R. In this case, P is called a representative of ≺. In order to achieve this,
usually an axiom of comparability is assumed such as a ≺ b or b ≺ a for all a, b ∈ R.

In the following, we discuss Koopman's conditional qualitative probability system
(Koopman, 1940). Interestingly, Koopman basically avoids use of any axiom of
comparability - at least initially. Koopman's axioms follow (Koopman, 1960, page 275).
They are axioms for a system such as our R|R.

V   Axiom of Verified Contingency
    (a|b) ≺ (c|c).

I   Axiom of Implication
    If (c|c) ≺ (a|b), then c ≤ a.

R   Axiom of Reflexivity
    (a|b) ≺ (a|b).

T   Axiom of Transitivity
    If (a|b) ≺ (c|d) and (c|d) ≺ (e|f), then (a|b) ≺ (e|f).

A   Axiom of Antisymmetry
    If (a|b) ≺ (c|d), then (c'|d) ≺ (a'|b).

C   Axiom of Composition
    C1  If (a|b) ≺ (c|d) and (e|ab) ≺ (f|cd), then (ae|b) ≺ (cf|d).
    C2  If (a|b) ≺ (f|cd) and (e|ab) ≺ (c|d), then (ae|b) ≺ (cf|d).

D   Axioms of Decomposition
    Suppose that (ac|b) ≺ (df|e). If either of (a|b) or (c|ab) is ≻ either of (d|e)
    or (f|de), then the remaining one of (a|b) and (c|ab) is ≺ the remaining one of
    (d|e) and (f|de).

P   Axioms of Alternative Presumption
    If (a|bc) ≺ (d|e) and (a|(bc)') ≺ (d|e), then (a|c) ≺ (d|e).

S   Axioms of Subdivision
    For any integer n, let the propositions a_1, a_2, ..., a_n, b_1, b_2, ..., b_n be
    such that a_i a_j = b_i b_j = 0 for i ≠ j; a = a_1 ∨ a_2 ∨ ... ∨ a_n ≠ 0;
    b = b_1 ∨ b_2 ∨ ... ∨ b_n ≠ 0; (a_1|a) ≺ (a_2|a) ≺ ... ≺ (a_n|a);
    (b_1|b) ≺ (b_2|b) ≺ ... ≺ (b_n|b). Then (a_1|a) ≺ (b_n|b).

Koopman derives many properties of his axioms. Our purpose here is to show that
the relation ≤ on R|R that we introduced in Section 3.3 satisfies all but the first and last
of Koopman's axiom system.

Theorem. Let R be a Boolean algebra. Then ≤, defined on R|R by (a|b) ≤ (c|d) if
and only if (a|b) = (a|b)(c|d), satisfies Koopman's axioms I, R, T, A, C, D, and P. It
does not satisfy his axioms V and S.

Proof. Throughout we will use our criterion that (a|b) ≤ (c|d) if and only if
ab ≤ cd and c'd ≤ a'b. (Theorem 1, Section

Axiom V is clearly false. Just pick ab > c.

Axioms I and R follow almost trivially from our criterion above, and axiom T is
noted immediately after the definition of ≤ in Section 3.3.

Axiom A is part of our criterion for (a|b) ≤ (c|d).
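The criterion and the easy axioms can be sanity-checked by exhaustive search in a small Boolean algebra. The following sketch makes several assumptions that are not in the text: events are subsets of a three-atom universe coded as bitmasks, two pairs (a, b) are taken to name the same conditional exactly when ab and b agree, and all function names are illustrative.

```python
from itertools import product

FULL = 0b111                     # subsets of a three-atom universe as bitmasks
EVENTS = range(FULL + 1)

def neg(x):
    return FULL & ~x

def sub(x, y):                   # x <= y in R
    return (x & ~y) == 0

def canon(a, b):
    """Assumed canonical form of (a|b): the pair (ab, b)."""
    return (a & b, b)

def gn_and(a, b, c, d):
    """Goodman-Nguyen conjunction (ac | a'b v c'd v bd)."""
    return a & c, (neg(a) & b) | (neg(c) & d) | (b & d)

def leq_def(a, b, c, d):
    """(a|b) <= (c|d) defined by (a|b) = (a|b)(c|d)."""
    return canon(a, b) == canon(*gn_and(a, b, c, d))

def leq(a, b, c, d):
    """Criterion: ab <= cd and c'd <= a'b."""
    return sub(a & b, c & d) and sub(neg(c) & d, neg(a) & b)

quads = list(product(EVENTS, repeat=4))
CONDS = sorted({canon(a, b) for a, b in product(EVENTS, repeat=2)})

criterion_ok = all(leq_def(*q) == leq(*q) for q in quads)
axiom_R = all(leq(a, b, a, b) for a, b in product(EVENTS, repeat=2))
axiom_A = all(leq(neg(c), d, neg(a), b) for a, b, c, d in quads if leq(a, b, c, d))
axiom_I = all(sub(c, a) for a, b, c in product(EVENTS, repeat=3) if leq(c, c, a, b))
axiom_T = all(leq(*x, *z) for x, y, z in product(CONDS, repeat=3)
              if leq(*x, *y) and leq(*y, *z))
# Triples (a, b, c) violating axiom V, that is, with (a|b) not below (c|c).
v_fails = [(a, b, c) for a, b, c in product(EVENTS, repeat=3) if not leq(a, b, c, c)]
print(criterion_ok, axiom_R, axiom_A, axiom_I, axiom_T, len(v_fails) > 0)
```

A finite search of this kind is of course no substitute for the proof; it only guards against transcription slips in the criterion.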

To verify the first part of C, let (a|b) ≤ (c|d) and (e|ab) ≤ (f|cd). Then

    ab ≤ cd,
    c'd ≤ a'b,
    eab ≤ fcd,

and

    f'cd ≤ e'ab.

To get (ae|b) ≤ (cf|d), we need

    aeb ≤ cfd

and

    (cf)'d ≤ (ae)'b.

The first we have, and from f'cd ≤ e'ab, we get

    e ∨ a' ∨ b' ≤ f ∨ c' ∨ d',

and from c'd ≤ a'b we get

    a ∨ b' ≤ c ∨ d'.

Thus

    (e ∨ a' ∨ b')(a ∨ b') ≤ (f ∨ c' ∨ d')(c ∨ d'),

or

    e(a ∨ b') ∨ b' ≤ f(c ∨ d') ∨ d',

or

    b' ∨ ea ≤ d' ∨ fc,

or

    b(e' ∨ a') ≥ d(c' ∨ f'),

which we needed.

For the second part of C, let (a|b) ≤ (f|cd) and (e|ab) ≤ (c|d). Then

    ab ≤ fcd,
    f'cd ≤ a'b,
    eab ≤ cd,

and

    c'd ≤ e'ab.

We need

    aeb ≤ cfd

and

    (cf)'d ≤ (ae)'b.

Since ab ≤ cfd, certainly aeb ≤ cfd. So we need only that

    (cf)'d ≤ (ae)'b,

or equivalently that

    ae ∨ b' ≤ cf ∨ d'.

Since

    f'cd ≤ a'b,
    a ∨ b' ≤ f ∨ c' ∨ d',

and since

    c'd ≤ e'ab,
    e ∨ a' ∨ b' ≤ c ∨ d'.

Thus

    (a ∨ b')(e ∨ a' ∨ b') ≤ (f ∨ c' ∨ d')(c ∨ d'),

and so

    (a ∨ b')e ∨ b' ≤ f(c ∨ d') ∨ d',

or

    ae ∨ b' ≤ fc ∨ d',

or finally

    (cf)'d ≤ (ae)'b,

which we needed.
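Both parts of axiom C can also be tested exhaustively in a small Boolean algebra using the criterion ab ≤ cd and c'd ≤ a'b. This is a sketch under assumptions: a two-atom universe coded as bitmasks is chosen only to keep the search small, and the function names are illustrative.

```python
from itertools import product

FULL = 0b11                      # subsets of a two-atom universe as bitmasks
EVENTS = range(FULL + 1)

def neg(x):
    return FULL & ~x

def sub(x, y):                   # x <= y in R
    return (x & ~y) == 0

def leq(a, b, c, d):             # (a|b) <= (c|d): ab <= cd and c'd <= a'b
    return sub(a & b, c & d) and sub(neg(c) & d, neg(a) & b)

def composition_counterexamples():
    """Collect every (a, b, c, d, e, f) violating C1 or C2."""
    bad = []
    for a, b, c, d, e, f in product(EVENTS, repeat=6):
        goal = leq(a & e, b, c & f, d)                       # (ae|b) <= (cf|d)
        c1 = leq(a, b, c, d) and leq(e, a & b, f, c & d)     # hypotheses of C1
        c2 = leq(a, b, f, c & d) and leq(e, a & b, c, d)     # hypotheses of C2
        if (c1 or c2) and not goal:
            bad.append((a, b, c, d, e, f))
    return bad

print(composition_counterexamples())
```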

To verify D, the axioms of decomposition, suppose that

    (ac|b) ≤ (df|e)

and (a|b) ≥ (d|e). We need that (c|ab) ≤ (f|de). Thus

    acb ≤ dfe,
    (df)'e ≤ (ac)'b,
    de ≤ ab,

and

    a'b ≤ d'e.

We need

    cab ≤ fde

and

    f'de ≤ c'ab,

and the first we have. From

    (df)'e ≤ (ac)'b

we have

    (d' ∨ f')e ≤ (a' ∨ c')b,

and since

    de ≤ ab,

we have

    (d' ∨ f')de ≤ (a' ∨ c')ab,

that is,

    f'de ≤ c'ab,

which we needed.

Assume now that (a|b) ≥ (f|de), and that we have always that (ac|b) ≤ (df|e). We
need that (c|ab) ≤ (d|e). So we are given

    fde ≤ ab,
    a'b ≤ f'de,
    acb ≤ dfe,

and

    (df)'e ≤ (ac)'b,

and we want

    cab ≤ de

and

    d'e ≤ c'ab.

Since acb ≤ dfe, then cab ≤ de. So we only need that

    d'e ≤ c'ab.

From

    (df)'e ≤ (ac)'b

and

    a'b ≤ f'de

we have

    (d' ∨ f')e ≤ (a' ∨ c')b

and

    f ∨ d' ∨ e' ≤ a ∨ b',

so

    (d' ∨ f')e(f ∨ d' ∨ e') ≤ (a' ∨ c')b(a ∨ b').

Thus

    (d' ∨ f')e(f ∨ d') ≤ (a' ∨ c')ab,

or

    d'e ≤ c'ab,

which is the inequality we needed. The other two parts are similar and their proofs
left to the reader.

To verify axiom P, let (a|bc) ≤ (d|e) and (a|(bc)') ≤ (d|e). Then

    abc ≤ de,
    d'e ≤ a'bc,
    a(bc)' ≤ de,

and

    d'e ≤ a'(bc)'.

We need that

    (a|c) ≤ (d|e),

or that

    ac ≤ de

and

    d'e ≤ a'c.

Now d'e ≤ a'c since d'e ≤ a'bc. To get ac ≤ de, from a(bc)' ≤ de we have

    a(b' ∨ c') ≤ de,

whence

    ab' ∨ ac' ≤ de.

Then

    ab' ≤ de.

We have from above that

    abc ≤ de.

Then

    ab'c ∨ abc = ac ≤ de,

and this proof is complete.  □
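The two decomposition cases worked out above, and axiom P in the form used in the proof, can be checked exhaustively in the same way. The sketch assumes a two-atom universe coded as bitmasks and uses illustrative function names; it covers only the two cases of D that are proved in the text.

```python
from itertools import product

FULL = 0b11                      # subsets of a two-atom universe as bitmasks
EVENTS = range(FULL + 1)

def neg(x):
    return FULL & ~x

def sub(x, y):                   # x <= y in R
    return (x & ~y) == 0

def leq(a, b, c, d):             # (a|b) <= (c|d): ab <= cd and c'd <= a'b
    return sub(a & b, c & d) and sub(neg(c) & d, neg(a) & b)

def decomposition_counterexamples():
    bad = []
    for a, b, c, d, e, f in product(EVENTS, repeat=6):
        if not leq(a & c, b, d & f, e):          # standing hypothesis (ac|b) <= (df|e)
            continue
        # Case 1: (a|b) >= (d|e) should force (c|ab) <= (f|de).
        if leq(d, e, a, b) and not leq(c, a & b, f, d & e):
            bad.append(("case 1", a, b, c, d, e, f))
        # Case 2: (a|b) >= (f|de) should force (c|ab) <= (d|e).
        if leq(f, d & e, a, b) and not leq(c, a & b, d, e):
            bad.append(("case 2", a, b, c, d, e, f))
    return bad

def presumption_counterexamples():
    bad = []
    for a, b, c, d, e in product(EVENTS, repeat=5):
        hyp = leq(a, b & c, d, e) and leq(a, neg(b & c), d, e)
        if hyp and not leq(a, c, d, e):
            bad.append((a, b, c, d, e))
    return bad

print(decomposition_counterexamples(), presumption_counterexamples())
```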

The axiom of subdivision obviously does not hold for our

CHAPTER 4

ALGEBRAIC  STRUCTURE  OF  CONDITIONAL  EVENTS 

This chapter is devoted to the study of the space of conditionals R|R as an
algebraic system. Equipped with the logical operations ∨, ∧, and ' introduced in
Chapter 3, it is a system generalizing Boolean algebras, or Boolean rings, and provides a
vehicle for manipulating conditional events, analogous to the manipulation of events in
Boolean algebras. Further, it represents a departure from classical logic, and from
quantum logic. First, in Section 4.1, we examine the basic algebraic properties of R|R,
concentrating on its similarities and its differences with those of Boolean algebra. In
Section 4.2, R|R is characterized as an abstract algebraic system, and a Stone
Representation Theorem is established, generalizing Stone's theorem for Boolean algebras.
In Section 4.3, R|R is identified with a semi-simple MV algebra via the work of Belluce
(1986), yielding a connection with multi-valued logic, and providing yet another Stone
Representation Theorem.

4.1 Basic algebraic properties

We now turn to a detailed examination of R|R as an algebraic system. Recall that
R|R is the set of all cosets of all principal ideals a + Rb' of the Boolean ring R, and we
have adopted the notation (a|b) for the coset a + Rb'. In the Boolean ring R, there are
the usual operations ∨, ∧, +, and ', and R has a 0 and a 1. We assume as known the
basic properties of Boolean rings, or Boolean algebras. Corresponding operations ∨, ∧, +,
and ' have been defined on R|R, and some of their properties have been noted in
earlier sections. Specifically,

(1)  (a|b) ∨ (c|d) = ((a ∨ c) | (ab ∨ cd ∨ bd)) = (ab ∨ cd | ab ∨ cd ∨ a'bc'd),

(2)  (a|b) ∧ (c|d) = ((a ∧ c) | (a'b ∨ c'd ∨ bd)) = (abcd | a'b ∨ c'd ∨ abcd),

(3)  (a|b)' = (a'|b),

(4)  (a|b) + (c|d) = (a + c | bd).

Above, if x, y ∈ R, then xy is written for x ∧ y. Note that the symbols ∨, ∧, +
and ' are used both as operations in the Boolean ring R and as operations in R|R. The
context should always make it clear what operation is meant. The most basic and
elementary algebraic properties of R|R are these:

Theorem 1. The following hold in R|R.

(1)  (a|b) ∨ (a|b) = (a|b) (∨ is idempotent);

(2)  (a|b) ∧ (a|b) = (a|b) (∧ is idempotent);

(3)  (a|b) ∨ (c|d) = (c|d) ∨ (a|b) (∨ is commutative);

(4)  (a|b) ∧ (c|d) = (c|d) ∧ (a|b) (∧ is commutative);

(5)  (a|b) + (c|d) = (c|d) + (a|b) (+ is commutative);

(6)  ((a|b) ∨ (c|d)) ∨ (e|f) = (a|b) ∨ (c|d ∨ e|f) (∨ is associative);

(7)  ((a|b) ∧ (c|d)) ∧ (e|f) = (a|b) ∧ (c|d ∧ e|f) (∧ is associative);

(8)  ((a|b) + (c|d)) + (e|f) = (a|b) + (c|d + e|f) (+ is associative);

(9)  (a|b)'' = (a|b) (' is involutive).

Proof. The proofs of these are routine from the definitions of the operations.
However, we give proofs of (1), (4), (5), (6), (7) and (9) as illustrations of elementary
manipulations in R|R.

(1)  (a|b) ∨ (a|b) = ((a ∨ a) | (ab ∨ ab ∨ b)) = (a|b).

(4)  (a|b) ∧ (c|d) = (ac | (a'b ∨ c'd ∨ bd)) = (ca | (c'd ∨ a'b ∨ db)) = (c|d) ∧ (a|b).

(5)  (a|b) + (c|d) = ((a + c) | bd) = ((c + a) | db) = (c|d) + (a|b).

(7)  ((a|b) ∧ (c|d)) ∧ (e|f) = (ac | (a'b ∨ c'd ∨ bd)) ∧ (e|f)

         = (ace | ((ac)'(a'b ∨ c'd ∨ bd) ∨ e'f ∨ (a'b ∨ c'd ∨ bd)f)).

     (a|b) ∧ (c|d ∧ e|f) = (a|b) ∧ (ce | (c'd ∨ e'f ∨ df))

         = (ace | (a'b ∨ (ce)'(c'd ∨ e'f ∨ df) ∨ b(c'd ∨ e'f ∨ df))).

Thus we need to show that

    (ac)'(a'b ∨ c'd ∨ bd) ∨ e'f ∨ (a'b ∨ c'd ∨ bd)f =
    a'b ∨ (ce)'(c'd ∨ e'f ∨ df) ∨ b(c'd ∨ e'f ∨ df).

The first is

    (ac)'a'b ∨ (ac)'c'd ∨ (ac)'bd ∨ e'f ∨ a'bf ∨ c'df ∨ bdf =
    a'b ∨ a'c'b ∨ a'c'd ∨ c'd ∨ a'bd ∨ c'bd ∨ e'f ∨ a'bf ∨ c'df ∨ bdf =
    a'b ∨ c'd ∨ e'f ∨ bdf,

and the second is

    a'b ∨ (ce)'c'd ∨ (ce)'e'f ∨ (ce)'df ∨ bc'd ∨ be'f ∨ bdf =
    a'b ∨ c'd ∨ e'c'd ∨ c'e'f ∨ e'f ∨ c'df ∨ e'df ∨ bc'd ∨ be'f ∨ bdf =
    a'b ∨ c'd ∨ e'f ∨ bdf.

(9)  (a|b)'' = (a'|b)' = (a''|b) = (a|b).  □

None of the properties (1) - (9) above involve interactions between the various
operations. We will address those properties momentarily. First, R|R has some special
elements, (0|1) and (1|1), which act as a "zero" and "one" should act. In addition, there
is the "indeterminate" element (0|0) = (1|0) = R.

Theorem 2. The elements (0|1), (1|1) and (0|0) satisfy the following properties.

(1)  (a|b) + (0|1) = (a|b) ((0|1) is an additive identity);

(2)  (a|b) ∧ (1|1) = (a|b) ((1|1) is a multiplicative identity);

(3)  (a|b) ∧ (0|1) = (0|1);

(4)  (0|1)' = (1|1);

(5)  (1|1)' = (0|1);

(6)  (a|b) = ab ∨ (b' ∧ (0|0));

(7)  The unique element (a|b) in R|R such that (a|b)' = (a|b) is (0|0).

Proof. Again, these properties are straightforward. For example,

    (a|b) ∧ (1|1) = (a | (a'b ∨ 0 ∨ b)) = (a|b),

and

    (a|b) ∧ (0|1) = (0 | (a'b ∨ 1 ∨ b)) = (0|1).  □

Note that there are no additive or multiplicative identities other than (0|1)
and (1|1). If x and y were two additive identities, then x = x + y = y, and similarly
for multiplicative identities.

The following theorem provides some connections between the various operations.
They are fundamental ones.

Theorem 3. The following hold in R|R.

(1)  (a|b) ∧ (c|d ∨ e|f) = ((a|b) ∧ (c|d)) ∨ ((a|b) ∧ (e|f)) (∧ distributes over ∨);

(2)  (a|b) ∨ ((c|d) ∧ (e|f)) = ((a|b) ∨ (c|d)) ∧ ((a|b) ∨ (e|f)) (∨ distributes over ∧);

(3)  ((a|b) ∨ (c|d))' = (a|b)' ∧ (c|d)';

(4)  ((a|b) ∧ (c|d))' = (a|b)' ∨ (c|d)' ((3) and (4) are De Morgan's laws).
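The four identities of Theorem 3 lend themselves to an exhaustive check over a small Boolean algebra, using the syntactic definitions of ∨, ∧ and ' given at the start of this section. The sketch rests on assumptions not made in the text: events are subsets of a two-atom universe coded as bitmasks, two pairs are treated as the same conditional when ab and b agree, and the names meet, join, comp and same are illustrative.

```python
from itertools import product

FULL = 0b11                      # subsets of a two-atom universe as bitmasks
EVENTS = range(FULL + 1)

def neg(x):
    return FULL & ~x

def canon(p):
    """Assumed canonical form of (a|b): the pair (ab, b)."""
    a, b = p
    return (a & b, b)

def same(p, q):
    return canon(p) == canon(q)

def meet(p, q):                  # (ac | a'b v c'd v bd)
    (a, b), (c, d) = p, q
    return (a & c, (neg(a) & b) | (neg(c) & d) | (b & d))

def join(p, q):                  # (a v c | ab v cd v bd)
    (a, b), (c, d) = p, q
    return (a | c, (a & b) | (c & d) | (b & d))

def comp(p):                     # (a'|b)
    a, b = p
    return (neg(a), b)

CONDS = list(product(EVENTS, repeat=2))
triples = list(product(CONDS, repeat=3))

dist_1 = all(same(meet(x, join(y, z)), join(meet(x, y), meet(x, z))) for x, y, z in triples)
dist_2 = all(same(join(x, meet(y, z)), meet(join(x, y), join(x, z))) for x, y, z in triples)
morgan_3 = all(same(comp(join(x, y)), meet(comp(x), comp(y))) for x in CONDS for y in CONDS)
morgan_4 = all(same(comp(meet(x, y)), join(comp(x), comp(y))) for x in CONDS for y in CONDS)
print(dist_1, dist_2, morgan_3, morgan_4)
```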

Proof. We will prove (1) and (3). Then we will see that (2) and (4) are immediate
consequences of (1) and (3). For (1),

    (a|b) ∧ (c|d ∨ e|f) = (a|b) ∧ ((c ∨ e) | (cd ∨ ef ∨ df)) =

    (a(c ∨ e) | (a'b ∨ c'e'(cd ∨ ef ∨ df) ∨ b(cd ∨ ef ∨ df))) =

    (a(c ∨ e) | (a'b ∨ c'e'df ∨ bcd ∨ bef ∨ bdf)).

Now

    ((a|b) ∧ (c|d)) ∨ ((a|b) ∧ (e|f)) = (ac | (a'b ∨ c'd ∨ bd)) ∨ (ae | (a'b ∨ e'f ∨ bf)) =
    (a(c ∨ e) | (acbd ∨ aebf ∨ a'b ∨ c'de'f ∨ c'dbf ∨ bde'f ∨ bdf)) =
    (a(c ∨ e) | (abcd ∨ abef ∨ a'b ∨ c'de'f ∨ bdf)).

Thus, we need to show that

    a'b ∨ c'e'df ∨ bcd ∨ bef ∨ bdf = abcd ∨ abef ∨ a'b ∨ c'de'f ∨ bdf.

Clearly, the left side contains the right. But since abcd ∨ a'b contains bcd, and
abef ∨ a'b contains bef, the right side contains the left. To prove (3), note that

    ((a|b) ∨ (c|d))' = ((a ∨ c) | (ab ∨ cd ∨ bd))' = (a'c' | (ab ∨ cd ∨ bd)),

and

    (a|b)' ∧ (c|d)' = (a'c' | (ab ∨ cd ∨ bd)).

The equations

    ((a|b) ∨ (c|d ∧ e|f)) = ((a|b) ∨ (c|d ∧ e|f))''

        = ((a|b)' ∧ (c|d ∧ e|f)')' = ((a|b) ∨ (c|d)) ∧ ((a|b) ∨ (e|f)),

and

    ((a|b) ∧ (c|d))' = ((a|b)'' ∧ (c|d)'')' =
    ((a|b)' ∨ (c|d)')'' = (a|b)' ∨ (c|d)',

establish (2) and (4).  □

We now come to some negative aspects of R|R. These center around the
operations + and '. First, R|R does not have negatives (or additive inverses). That is,
given (a|b) in R|R, there does not necessarily exist an element (c|d) such that
(a|b) + (c|d) = (0|1). Indeed, if so, then

    (a|b) + (c|d) = ((a + c)|bd) = (0|1),

whence bd = 1, so b = 1 and d = 1. In that case, (a|b) = (a|1), and it is its own
negative. Thus the elements in R|R with negatives are precisely those of the form
(a|1). In R, a + a = 0. This does not carry over to R|R. For example (a|b) + (a|b) =
(0|b), which is not (0|1) unless b = 1. That is, R|R is not of characteristic 2.

Secondly, ∧ does not distribute over +. A simple example is this:

    (1|b)((1|d) + (1|f)) ≠ (1|b)(1|d) + (1|b)(1|f),

the first being (0|df) and the second being (0|bdf).

Thirdly, ' is not a true complement for R|R. That is, (a|b) ∨ (a|b)' ≠ (1|1). In
fact,

    (a|b) ∨ (a|b)' = (a|b) ∨ (a'|b) = (1 | (ab ∨ a'b ∨ b)) = (1|b),

which is not (1|1) unless b = 1. Also,

    (a|b) ∧ (a|b)' = (0 | (a'b ∨ ab ∨ b)) = (0|b) ≠ (0|1)

unless b = 1.

These negative aspects of R|R are summed up in the following theorem. In
particular, R|R is far from being a Boolean ring.

Theorem 4. The following hold:

(1)  R|R is not a group under +; specifically, not every element has a negative;

(2)  ∧ does not distribute over +;

(3)  ' is not a complementation operator on R|R; specifically, (a|b) ∨ (a|b)' is
not necessarily (1|1), and (a|b) ∧ (a|b)' is not necessarily (0|1).

(4)  R|R is not of characteristic 2, that is, (a|b) + (a|b) is not necessarily (0|1).

In a Boolean ring, the four basic operations, ∨, ∧, +, and ' are not independent.
For example, a + b = a'b ∨ ab', which corresponds to the "exclusive or". This relation
also holds in R|R as we have noted back in Section 3.2. Thus, properties of + in R|R
are reflections of properties of ∨, ∧, and '. We have just noted some negative
properties of +, indicating that it will be of at best limited importance and interest in
R|R. For these reasons, and because ∨, ∧, and ' are more conceptually fundamental
operations in logic and probability, we will drop the operation + from our considerations.
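The negative properties collected in Theorem 4 can be exhibited concretely in a small Boolean algebra. The sketch below is illustrative only: it assumes a two-atom universe coded as bitmasks, takes + in R to be symmetric difference, identifies a conditional with the pair (ab, b), and uses invented names such as plus, meet, join and comp.

```python
from itertools import product

FULL = 0b11                      # subsets of a two-atom universe as bitmasks
EVENTS = range(FULL + 1)
ZERO, ONE = (0, FULL), (FULL, FULL)        # the elements (0|1) and (1|1)

def neg(x):
    return FULL & ~x

def canon(p):
    a, b = p
    return (a & b, b)

def same(p, q):
    return canon(p) == canon(q)

def plus(p, q):                  # (a + c | bd), + being symmetric difference
    (a, b), (c, d) = p, q
    return (a ^ c, b & d)

def meet(p, q):                  # (ac | a'b v c'd v bd)
    (a, b), (c, d) = p, q
    return (a & c, (neg(a) & b) | (neg(c) & d) | (b & d))

def join(p, q):                  # (a v c | ab v cd v bd)
    (a, b), (c, d) = p, q
    return (a | c, (a & b) | (c & d) | (b & d))

def comp(p):                     # (a'|b)
    a, b = p
    return (neg(a), b)

CONDS = list(product(EVENTS, repeat=2))

# (1) The elements with an additive inverse are exactly those of the form (a|1).
with_negative = {canon(x) for x in CONDS if any(same(plus(x, y), ZERO) for y in CONDS)}

# (2) With x = (1|b), y = z = (1|1): x(y + z) differs from xy + xz when b is not 1.
b = 0b01
x = (FULL, b)
lhs = meet(x, plus(ONE, ONE))
rhs = plus(meet(x, ONE), meet(x, ONE))

# (3), (4) x v x' = (1|b), x ^ x' = (0|b) and x + x = (0|b) for every x = (a|b).
complement_law = all(same(join(p, comp(p)), (FULL, p[1])) and
                     same(meet(p, comp(p)), (0, p[1])) for p in CONDS)
doubling_law = all(same(plus(p, p), (0, p[1])) for p in CONDS)
print(sorted(with_negative), canon(lhs), canon(rhs), complement_law, doubling_law)
```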

In Section 3.3, an order relation on R|R was introduced. This order relation will
now be examined in some detail. We first establish the appropriate language in which to
discuss this topic. A good reference for the following material is Gratzer (1978).

Definition 1. A partially ordered set is a set L with a relation ≤ on L such that for all
x, y, z ∈ L,

(1)  x ≤ x (≤ is reflexive);

(2)  if x ≤ y and y ≤ x, then x = y (≤ is anti-symmetric);

(3)  if x ≤ y and y ≤ z, then x ≤ z (≤ is transitive).

This partially ordered set is denoted (L, ≤), or just L if there is no confusion as to
the partial order under consideration. Let (L, ≤) be a partially ordered set, and let S be
a subset of L. The element x is an upper bound of S if s ≤ x for every s ∈ S. The
element x is a least upper bound, or supremum, or simply sup, of S if x is an upper
bound and x ≤ y for any upper bound y of S. Lower bounds, and greatest lower
bounds, or infima, or inf are defined analogously.

Definition 2. A lattice is a partially ordered set L such that every pair {a,b} of elements
of L has a sup and an inf.

The sup of {a,b} is usually denoted a ∨ b and the inf by a ∧ b. Note that this
makes sense, namely that {a,b} has only one sup. If x and y were both sups, then x ≤
y since x ≤ any element y which is ≥ all elements in {a,b}, and y ≤ x for the same
reason. Thus x = y. Similarly infs are unique. Now ∨ and ∧ are two binary operations
on L, and they satisfy the following conditions.

(1)  x ∨ x = x and x ∧ x = x (∨ and ∧ are idempotent);

(2)  x ∨ y = y ∨ x and x ∧ y = y ∧ x (∨ and ∧ are commutative);

(3)  (x ∨ y) ∨ z = x ∨ (y ∨ z) and (x ∧ y) ∧ z = x ∧ (y ∧ z) (∨ and ∧ are associative).

(4)  x ∨ (x ∧ y) = x and x ∧ (x ∨ y) = x (∨ and ∧ satisfy the absorption identities).

Property (1) follows from reflexivity and anti-symmetry, and (2) and (3) directly
from the definitions of sups and infs. To get x ∨ (x ∧ y) = x in (4), note that x ≤ x by
reflexivity, and x ≥ x ∧ y by definition of x ∧ y, so x is an upper bound of x and
x ∧ y. If z is another such upper bound and z ≤ the upper bound x, then since x ≤ z,
z = x. The other part of (4) follows similarly. The following two properties should also
be noted.

(5)  x ≥ x ∧ y and x ≤ x ∨ y,

(6)  x = x ∨ y if and only if y = x ∧ y.

Property (5) is immediate from the definitions of upper and lower bounds, and (6) is
a consequence of (4). For example, if x = x ∨ y, then by (4), y ∧ (x ∨ y) = y = y ∧ x. The
other half of (6) follows similarly. Actually, the absorption identities imply that ∨ and ∧
are idempotent, but we will not concern ourselves with such technical niceties here.

We provide the following theorem and its proof, since it will hold for our $R|R$, and the proof in general is as easy as for the special case of $R|R$. We have already noted its converse.

Theorem 5. If $L$ is a non-empty set with two binary operations $\vee$ and $\wedge$ which satisfy (1)-(4) above, then $L$ is a lattice under the partial order given by $x \leq y$ if $x = x \wedge y$.

Proof. First we get $\leq$ to be a partial order on $L$. $x \leq x$ since $x \wedge x = x$. If $x \leq y$ and $y \leq z$, then

$$x \wedge z = (x \wedge y) \wedge z = x \wedge (y \wedge z) = x \wedge y = x,$$

so $x \leq z$. If $x \leq y$ and $y \leq x$, then $x = x \wedge y$ and $y = y \wedge x$, so $x = y$. Now we show that $x \wedge y$ is the inf of $\{x,y\}$ and $x \vee y$ is the sup of $\{x,y\}$. $x \wedge (x \wedge y) = x \wedge y$, so $x \wedge y \leq x$, and similarly $x \wedge y \leq y$, so $x \wedge y$ is a lower bound of $\{x,y\}$. If $z \leq x$ and $z \leq y$, then

$$x \wedge z = z = y \wedge z = y \wedge x \wedge z,$$

so $z \leq x \wedge y$. Thus $x \wedge y$ is the inf of $\{x,y\}$. By one of the absorption laws, $x \wedge (x \vee y) = x$, so $x \leq x \vee y$, and similarly $y \leq x \vee y$. If $x \leq z$ and $y \leq z$, then $x = x \wedge z$ and $y = y \wedge z$. In turn, by absorption, $z = z \vee x = z \vee y$, implying

$$z = z \vee z = (z \vee x) \vee (z \vee y) = z \vee (x \vee y),$$

implying $x \vee y \leq z$, that is, that $x \vee y$ is the sup of $\{x,y\}$, and $(L, \leq)$ is a lattice. □

Thus we have the following situation. If $(L, \leq)$ is a lattice, then $(L, \vee, \wedge)$ satisfies properties (1)-(4) above, where $\vee$ and $\wedge$ are defined by $a \vee b = \sup\{a,b\}$ and $a \wedge b = \inf\{a,b\}$. Conversely, if $(L, \vee, \wedge)$ satisfies (1)-(4) above, then $(L, \leq)$ is a lattice, where $\leq$ is defined by $a \leq b$ if $a = a \wedge b$. Such an algebra $(L, \vee, \wedge)$ is also called a lattice. Thus a lattice $(L, \leq)$ yields an algebra $(L, \vee, \wedge)$ satisfying (1)-(4) above, and an algebra $(L, \vee, \wedge)$ satisfying (1)-(4) yields a lattice $(L, \leq)$. A critical fact is that these procedures are inverses of each other. Thus the concept of lattice, and the concept of an algebra with two binary operations satisfying (1)-(4), are the same. We refer the reader to Grätzer (1978) for details.
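The passage from an algebra satisfying (1)-(4) to a lattice order can be checked by brute force on a small example. The sketch below is not from the text: the choice of example (the divisors of 12 with lcm as join and gcd as meet) and all function names are illustrative assumptions.

```python
from math import gcd
from itertools import product

def lcm(x, y):
    return x * y // gcd(x, y)

# Illustrative carrier set: the divisors of 12.
L = [d for d in range(1, 13) if 12 % d == 0]

def satisfies_axioms(L, join, meet):
    """Brute-force check of identities (1)-(4)."""
    for x, y, z in product(L, repeat=3):
        ok = (
            join(x, x) == x and meet(x, x) == x                 # (1)
            and join(x, y) == join(y, x)                        # (2)
            and meet(x, y) == meet(y, x)
            and join(join(x, y), z) == join(x, join(y, z))      # (3)
            and meet(meet(x, y), z) == meet(x, meet(y, z))
            and join(x, meet(x, y)) == x                        # (4)
            and meet(x, join(x, y)) == x
        )
        if not ok:
            return False
    return True

def leq(x, y, meet):
    """The order of Theorem 5: x <= y if x = x ^ y."""
    return x == meet(x, y)

def meet_is_inf_join_is_sup(L, join, meet):
    """Check that meet and join are inf and sup for the induced order."""
    for x, y in product(L, repeat=2):
        lower = [z for z in L if leq(z, x, meet) and leq(z, y, meet)]
        upper = [z for z in L if leq(x, z, meet) and leq(y, z, meet)]
        m, j = meet(x, y), join(x, y)
        if m not in lower or not all(leq(z, m, meet) for z in lower):
            return False
        if j not in upper or not all(leq(j, z, meet) for z in upper):
            return False
    return True
```

In this example the induced order is divisibility, so `leq(2, 6, gcd)` holds while `leq(4, 6, gcd)` does not.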

Now back to $R|R$. The two operations $\vee$ and $\wedge$ on $R|R$ do indeed satisfy (1) through (4) above. We have already observed that (1), (2), and (3) hold. For (4),

$$
\begin{aligned}
(a \mid b) \vee ((a \mid b) \wedge (c \mid d))
&= ((a \mid b) \vee (a \mid b)) \wedge ((a \mid b) \vee (c \mid d)) \\
&= (a \mid b) \wedge (a \vee c \mid ab \vee cd \vee bd) \\
&= (a \mid a'b \vee (a \vee c)'(ab \vee cd \vee bd) \vee b(ab \vee cd \vee bd)) \\
&= (a \mid a'b \vee a'c'bd \vee ab \vee bcd \vee bd) \\
&= (a \mid a'b \vee ab \vee bd) = (a \mid b).
\end{aligned}
$$

The other absorption law follows similarly. Thus by Theorem 5, $(R|R, \leq)$ is a lattice. In considerations of $R|R$, emphasis is usually more on $\vee$ and $\wedge$ than on $\leq$, the former being the more fundamental concepts for us. Thus we prefer the following statement.

Theorem 6. $(R|R, \vee, \wedge)$ is a lattice.

If $L$ is a lattice and $L$ itself has a sup and an inf, then that sup is denoted 1 and that inf is denoted 0. In that case, $L$ is called a lattice with 0 and 1, or a bounded lattice. Note now that $R|R$ is a bounded lattice. The 1 is the element $(1 \mid 1)$ and the 0 is the element $(0 \mid 1)$. To see this, recall our criteria that $(a \mid b) \leq (c \mid d)$, namely that $ab \leq cd$ and $c'd \leq a'b$. Thus $(a \mid b) \leq (1 \mid 1)$ since $ab \leq 1$ and $0 \leq a'b$. Thus $(1 \mid 1)$ is the 1 of the lattice $R|R$. Similarly $(0 \mid 1)$ is the 0 of it, and $R|R$ is indeed a bounded lattice.

A lattice is called distributive if the following conditions hold.

$$x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z);$$

$$x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z).$$

We have seen that these distributive laws do hold in $R|R$, so we have the following theorem.

Theorem 7. $(R|R, \vee, \wedge)$ is a bounded distributive lattice. The 0 is the element $(0 \mid 1)$, and the 1 is the element $(1 \mid 1)$.

In a bounded lattice, an element $x$ is a complement of $y$ if $x \wedge y = 0$ and $x \vee y = 1$. Complements, if they exist, are unique in bounded distributive lattices. The complement of $x$ is usually denoted $x'$. Note that $x'' = x$. A bounded lattice in which every element has a complement is called a complemented lattice. A complemented distributive lattice satisfies DeMorgan's laws:

$$(x \vee y)' = x' \wedge y';$$

$$(x \wedge y)' = x' \vee y'.$$

The details may be found in Grätzer (1978), and we will not pursue them here, mainly because $R|R$, our lattice of interest, is not complemented. Were it complemented, then the complement $(c \mid d)$ of $(0 \mid b)$ would have the property that

$$(c \mid d) \vee (0 \mid b) = (cd \mid d) \vee (0 \mid b) = (1 \mid 1) = (cd \mid cd \vee bd),$$

whence $cd = 1 = c = d$. But the complement of $(1 \mid 1)$ must then be $(0 \mid b)$, whereas it is $(0 \mid 1)$ instead. Thus no element of the form $(0 \mid b)$ can have a complement unless $b = 1$. In particular, our operator $'$ on $R|R$ is not a complementation operator. There does not exist a complementation operator on $R|R$ with respect to $\vee$ and $\wedge$.
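Theorem 7 and the failure of complementation can be confirmed on a small finite model. The sketch below is an illustration only, under the assumption that $R$ is the Boolean algebra of subsets of a two-point set, with subsets encoded as bitmasks and each conditional event stored in the normal form $(ab \mid b)$; the names `ce`, `join`, `meet`, and `complements` are hypothetical.

```python
from itertools import product

N = 2                       # assumed size of the underlying sample space
FULL = (1 << N) - 1         # the event 1 of R

def neg(a):
    """Complement a' in R."""
    return FULL & ~a

def ce(a, b):
    """Normal form (ab | b) of the conditional event (a | b)."""
    return (a & b, b)

def join(x, y):
    # (a|b) v (c|d) = (a v c | ab v cd v bd)
    (a, b), (c, d) = x, y
    return ce(a | c, (a & b) | (c & d) | (b & d))

def meet(x, y):
    # (a|b) ^ (c|d) = (ac | a'b v c'd v bd)
    (a, b), (c, d) = x, y
    return ce(a & c, (neg(a) & b) | (neg(c) & d) | (b & d))

EVENTS = sorted({ce(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})
ZERO, ONE = ce(0, FULL), ce(FULL, FULL)

def is_bounded_distributive_lattice():
    for x, y, z in product(EVENTS, repeat=3):
        ok = (
            join(x, x) == x and meet(x, x) == x
            and join(x, y) == join(y, x) and meet(x, y) == meet(y, x)
            and join(join(x, y), z) == join(x, join(y, z))
            and meet(meet(x, y), z) == meet(x, meet(y, z))
            and join(x, meet(x, y)) == x and meet(x, join(x, y)) == x
            and meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
            and join(x, meet(y, z)) == meet(join(x, y), join(x, z))
            and meet(x, ZERO) == ZERO and join(x, ONE) == ONE
        )
        if not ok:
            return False
    return True

def complements(x):
    """All y with x ^ y = 0 and x v y = 1."""
    return [y for y in EVENTS if meet(x, y) == ZERO and join(x, y) == ONE]
```

In this model an element $(a \mid b)$ has a complement exactly when $b = 1$, which matches the argument given for elements of the form $(0 \mid b)$.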

There is a weaker notion than complement. In a bounded lattice, an element $x^*$ is a pseudocomplement of $x$ if $x \wedge x^* = 0$, and if $x \wedge y = 0$ implies that $y \leq x^*$. An element can have at most one pseudocomplement; if $a$ and $b$ are pseudocomplements of $x$, then $a \leq b$ and $b \leq a$, so $a = b$. Thus a pseudocomplement of an element $x$ is that unique largest element whose intersection with $x$ is 0. A pseudocomplemented lattice is one in which every element has a pseudocomplement.

Definition 3. A Stone algebra $L$ is a distributive pseudocomplemented (bounded) lattice which satisfies Stone's identity: for all $a \in L$,

$$a^* \vee a^{**} = 1.$$

It is a fact that in any Stone algebra, the pseudocomplementation operator $*$ satisfies DeMorgan's laws (Grätzer, pages 113, 119). A crucial fact is that $R|R$ is a Stone algebra, and this is not entirely obvious.

Theorem 8. $(R|R, \vee, \wedge)$ is a Stone algebra. The pseudocomplement $(a \mid b)^*$ of an element $(a \mid b)$ is $(a'b \mid 1)$, that is, $a'b$. DeMorgan's laws hold for $*$:

$$((a \mid b) \vee (c \mid d))^* = (a \mid b)^* \wedge (c \mid d)^*;$$

$$((a \mid b) \wedge (c \mid d))^* = (a \mid b)^* \vee (c \mid d)^*.$$

Proof. First, we show that $(a \mid b)^* = (a'b \mid 1)$.

$$(a \mid b) \wedge (a'b \mid 1) = (0 \mid a'b \vee (a \vee b') \vee b) = (0 \mid 1).$$

If $(c \mid d) \wedge (a \mid b) = (0 \mid 1)$, then $(ca \mid c'd \vee a'b \vee bd) = (0 \mid 1)$. Now, from the characterization of $\leq$ on $R|R$, $(c \mid d) \leq (a'b \mid 1)$ if and only if $cd \leq a'b$ and $a \vee b' \leq c'd$. We have $c'd \vee a'b \vee bd = 1$ and $ac = 0$. From the equation

$$c'd \vee a'b \vee bd = 1,$$

by first conjoining with $b'$ and then separately with $d'$ we get $b' \leq c'd$ and $d' \leq a'b$. Thus $b' \leq c'$, $a \leq d$, and since $ac = 0$, also $a \leq c'$. We have then that $a \vee b' \leq c'd$. It remains to get $cd \leq a'b$. But from $ac = 0$ we have $c \leq a'$, and from $b' \leq c'$ we have $c \leq b$. Thus $cd \leq a'b$, and so $(a \mid b)^* = (a'b \mid 1)$.

DeMorgan's laws can be verified easily now that we have an explicit formula for $*$.

$$((a \mid b) \vee (c \mid d))^* = (a \vee c \mid ab \vee cd \vee bd)^* = ((a'c')(ab \vee cd \vee bd) \mid 1) = (a'c'bd \mid 1),$$

and

$$(a \mid b)^* \wedge (c \mid d)^* = (a'b \mid 1) \wedge (c'd \mid 1) = (a'c'bd \mid 1).$$

The other part of DeMorgan's laws follows similarly. □
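The formula $(a \mid b)^* = (a'b \mid 1)$, Stone's identity, and DeMorgan's laws for $*$ can be checked exhaustively on a small model. As before, this is only a sketch under the assumption that $R$ is the power set of a two-point set encoded as bitmasks; the helper names are hypothetical.

```python
N = 2                       # assumed size of the underlying sample space
FULL = (1 << N) - 1

def neg(a):
    return FULL & ~a

def ce(a, b):
    """Normal form (ab | b) of the conditional event (a | b)."""
    return (a & b, b)

def join(x, y):
    (a, b), (c, d) = x, y
    return ce(a | c, (a & b) | (c & d) | (b & d))

def meet(x, y):
    (a, b), (c, d) = x, y
    return ce(a & c, (neg(a) & b) | (neg(c) & d) | (b & d))

def star(x):
    """Claimed pseudocomplement (a|b)* = (a'b | 1)."""
    a, b = x
    return ce(neg(a) & b, FULL)

def leq(x, y):
    return meet(x, y) == x

EVENTS = sorted({ce(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})
ZERO, ONE = ce(0, FULL), ce(FULL, FULL)

def star_is_pseudocomplement():
    """star(x) is the largest y with x ^ y = 0."""
    for x in EVENTS:
        disjoint = [y for y in EVENTS if meet(x, y) == ZERO]
        s = star(x)
        if s not in disjoint or not all(leq(y, s) for y in disjoint):
            return False
    return True

def stone_identity_holds():
    return all(join(star(x), star(star(x))) == ONE for x in EVENTS)

def de_morgan_holds():
    return all(
        star(join(x, y)) == meet(star(x), star(y))
        and star(meet(x, y)) == join(star(x), star(y))
        for x in EVENTS for y in EVENTS
    )
```

Pointwise, `star` sends the truth value 0 to 1 and both $u$ and 1 to 0.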

Remark. Thus we have that $R|R$ has a rather rich algebraic structure, being a pseudocomplemented distributive lattice, in fact, a Stone algebra. However, it is not complemented, that is, it does not have an operator $^{\circ}$ on it such that $a^{\circ} \wedge a = 0$ and $a^{\circ} \vee a = 1$ for all $a$. This situation is somewhat different from that of quantum logic. Indeed, a space of quantum events is a collection of closed subspaces of some complex Hilbert space and its algebraic structure is also that of a lattice, but of a non-distributive yet complemented one. (See Gudder, 1988.) As pointed out in Section 3.5, the truth table of the pseudocomplementation operator $*$ on $R|R$ is the truth table of Heyting's negation operator in his three-valued logic.

The structure $R|R$ is one generalization of Boolean algebras. It is a special kind of Stone algebra, and will be so characterized in the next section. But it can be viewed in other ways, depending on which operations on $R|R$ one investigates. For example, looking at other operations on $R|R$ in conjunction with our $\vee$ and $\wedge$ makes $R|R$ into a semi-simple MV-algebra, and this will be discussed in Section 4.3, with an attendant Stone representation theorem.

4.2  An  abstraction  of  the  space  of  conditional  events 

Section 4.1 culminated with the theorem that $R|R$ is a Stone algebra. One way to give an algebraic characterization of $R|R$ is to identify it among Stone algebras. That is what we will do. Thus we need to determine just what conditions on Stone algebras make them precisely of the form $R|R$. For such a characterization to be a good one, the conditions added should be succinct and conceptually pleasing, involving fundamental entities associated with Stone algebras. Two such entities are in the following definition.

Definition 1. Let $L$ be a Stone algebra and $*$ its pseudocomplementation operator. The skeleton of $L$ is the set $L^* = \{a^* : a \in L\}$. The dense set of $L$ is the kernel of $*$:

$$D(L) = \{a : a \in L,\ a^* = 0\}.$$

We need a number of properties of $L^*$ and $D = D(L)$. A more complete discussion may be found in Grätzer (1978).

Theorem 1. Let $L$ be a Stone algebra. The following hold:

(1) $a \leq a^{**}$;

(2) $a \leq b$ implies that $a^* \geq b^*$;

(3) $a^* = a^{***}$;

(4) $a \in L^*$ if and only if $a = a^{**}$;

(5) $(a \wedge b)^* = a^* \vee b^*$;

(6) $(a \vee b)^* = a^* \wedge b^*$.

Proof. (1) and (2) follow immediately from the definition of pseudocomplement. (1) and (2) imply that $a^* \geq a^{***}$, and (1) applied to $a^*$ yields $a^* \leq a^{***}$. Thus (3) holds. If $a \in L^*$, then $a = b^*$ for some $b$, so $a^{**} = b^{***} = b^* = a$. If $a = a^{**}$, then $a = (a^*)^*$, whence $a \in L^*$, so (4) holds.

To prove (5), we have

$$(a \wedge b) \wedge (a^* \vee b^*) = (a \wedge b \wedge a^*) \vee (a \wedge b \wedge b^*) = 0 \vee 0 = 0.$$

If $(a \wedge b) \wedge x = 0$, then $(b \wedge x) \wedge a = 0$, so that $b \wedge x \leq a^*$. Thus

$$(b \wedge x) \wedge a^{**} \leq a^* \wedge a^{**} = 0,$$

so $(x \wedge a^{**}) \wedge b = 0$, implying that $x \wedge a^{**} \leq b^*$. By the Stone identity, $a^* \vee a^{**} = 1$, and thus

$$x = x \wedge 1 = x \wedge (a^* \vee a^{**}) = (x \wedge a^*) \vee (x \wedge a^{**}) \leq a^* \vee b^*.$$

Thus (5) is proved.

To prove (6),

$$(a \vee b) \wedge (a^* \wedge b^*) = (a \wedge a^* \wedge b^*) \vee (b \wedge a^* \wedge b^*) = 0 \vee 0 = 0.$$

If $x \wedge (a \vee b) = 0$, then $x \leq (a \vee b)^*$. But $a \vee b \geq a$ implies that $(a \vee b)^* \leq a^*$, so $x \leq a^*$, and similarly $x \leq b^*$, whence $x \leq a^* \wedge b^*$. Thus $(a \vee b)^* = a^* \wedge b^*$, and (6) is proved. □

The properties in the theorem yield the following fundamental facts about $L^*$ and $D$.

Theorem 2. Let $L$ be a Stone algebra. Then

(1) $L^*$ is a Boolean algebra whose 0 and 1 are those of $L$;

(2) $D$ is a filter (dual ideal) and $1 \in D$. In particular, $D$ is a distributive lattice with 1.

Proof. Clearly $0^* = 1$ and $1^* = 0$, so that 0 and 1 are in $L^*$. From Theorem 1, $a^* \wedge b^* = (a \vee b)^*$ and $a^* \vee b^* = (a \wedge b)^*$, so that $L^*$ is a sublattice of $L$. Since $a^* \vee a^{**} = 1$, $*$ is a complementation operator on $L^*$. Thus $L^*$ is a Boolean algebra.

If $a, b \in D$, then $(a \vee b)^* = a^* \wedge b^* = 0 \wedge 0 = 0$ and $(a \wedge b)^* = a^* \vee b^* = 0$, so $D$ is a sublattice. If $a \in D$, then for all $x \in L$, $(a \vee x)^* = a^* \wedge x^* = 0 \wedge x^* = 0$, whence $D$ is a filter. Since $1^* = 0$, $1 \in D$. □

Now we turn to $R|R$, identify its skeleton and dense set, and note some of their special properties. Recall that the pseudocomplementation operator $*$ on $R|R$ is given by $(a \mid b)^* = (a'b \mid 1)$. Thus it is clear that $(R|R)^* = \{(a \mid 1) : a \in R\}$, which we denote by $R|1$. For $(a \mid b)^*$ to be $(0 \mid 1)$, we must have $(a'b \mid 1) = (0 \mid 1)$, so $a'b = 0$. Thus $b \leq a$, so $(a \mid b) = (b \mid b) = (1 \mid b)$. It follows that $D(R|R) = \{(1 \mid b) : b \in R\}$, which we denote by $1|R$. Thus we have the following theorem.

Theorem 3. The skeleton of $R|R$ is $(R|R)^* = \{(a \mid 1) : a \in R\} = R|1 = R$, and the dense set of $R|R$ is $D(R|R) = \{(1 \mid a) : a \in R\} = \{(a \mid a) : a \in R\} = 1|R$.

Both $R|1$ and $1|R$ are copies of $R$. In fact, the elements of $R|1$ are identified with the (unconditional) events of $R$. The mapping $R|1 \to 1|R : (a \mid 1) \to (1 \mid a)$ is clearly a bijection. Since

$$(1 \mid a) \vee (1 \mid b) = (1 \mid (a \vee b)) \quad \text{and} \quad (1 \mid a) \wedge (1 \mid b) = (1 \mid (a \wedge b)),$$

that mapping preserves $\vee$ and $\wedge$. Further, $(0 \mid 1)$ and $(1 \mid 1)$ of $R|1$ go to $(1 \mid 0)$ and $(1 \mid 1)$, respectively, of $1|R$, and $(a \mid 1)' = (a' \mid 1)$ goes to $(1 \mid a')$. Thus $1|R$ is a Boolean algebra with its 0 and 1 the elements $(1 \mid 0)$ and $(1 \mid 1)$, respectively, and with $(1 \mid a)' = (1 \mid a')$. Thus the dense set of $R|R$ is also a Boolean algebra, and is isomorphic to the skeleton of $R|R$. Since $1|R$ is a Boolean algebra, it has an operation $+$ given by $x + y = x'y \vee xy'$ making it into a Boolean ring. This $+$ is not the $+$ inherited from $R|R$, since the complementation operation $'$ on $1|R$ is not the restriction of the complementation $'$ on $R|R$.

Now suppose that $L$ is a Stone algebra, and it is known that its dense set $D$ is a Boolean algebra isomorphic to its skeleton $L^*$. There is no obvious way to effect this isomorphism. However, since $D$ is a filter, $a \vee x$ is in $D$ for any $a \in L$ and any $x \in D$. The mapping $a \to a \vee x$ is a homomorphism from $L$ into $D$, and in particular from $L^*$ into $D$. Just observe that $(a \vee x) \wedge (b \vee x) = (a \wedge b) \vee x$, so that the mapping preserves $\wedge$, and similarly it preserves $\vee$. If $D$ is Boolean, or more generally, if $D$ is a bounded lattice and thus has a 0, say $\Omega$, then that is a natural element to pick in hopes of yielding an isomorphism between $L^*$ and $D$. In $R|R$, the element $\Omega$ is $(1 \mid 0) = (0 \mid 0)$ as noted above, and indeed the mapping $(a \mid 1) \to (a \mid 1) \vee (1 \mid 0) = (a \mid a) = (1 \mid a)$ effects the isomorphism already noted between $R|1$ and $1|R$.

We  sum  up. 

Theorem 4. In $R|R$, the skeleton $R|1$ and the dense set $1|R$ are Boolean algebras, and the mapping $(a \mid 1) \to (a \mid 1) \vee (1 \mid 0)$ is an isomorphism between them.
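Theorems 3 and 4 lend themselves to a direct computation on a small model. The following sketch assumes, purely for illustration, that $R$ is the power set of a two-point set encoded as bitmasks; the function names are hypothetical and `OMEGA` stands for the element $(1 \mid 0)$.

```python
N = 2                       # assumed size of the underlying sample space
FULL = (1 << N) - 1

def neg(a):
    return FULL & ~a

def ce(a, b):
    """Normal form (ab | b) of the conditional event (a | b)."""
    return (a & b, b)

def join(x, y):
    (a, b), (c, d) = x, y
    return ce(a | c, (a & b) | (c & d) | (b & d))

def meet(x, y):
    (a, b), (c, d) = x, y
    return ce(a & c, (neg(a) & b) | (neg(c) & d) | (b & d))

def star(x):
    a, b = x
    return ce(neg(a) & b, FULL)

EVENTS = sorted({ce(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})
ZERO = ce(0, FULL)
OMEGA = ce(FULL, 0)         # (1 | 0) = (0 | 0), the 0 of the dense set

def skeleton():
    return sorted({star(x) for x in EVENTS})

def dense_set():
    return sorted(x for x in EVENTS if star(x) == ZERO)

def to_dense(s):
    """The map s -> s v Omega from the skeleton to the dense set."""
    return join(s, OMEGA)

def is_isomorphism():
    S, D = skeleton(), dense_set()
    bijective = sorted(to_dense(s) for s in S) == D
    homomorphic = all(
        to_dense(join(s, t)) == join(to_dense(s), to_dense(t))
        and to_dense(meet(s, t)) == meet(to_dense(s), to_dense(t))
        for s in S for t in S
    )
    return bijective and homomorphic
```

In this model the skeleton consists of the pairs $(a \mid 1)$ and the dense set of the pairs $(b \mid b)$, each a copy of $R$.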

It turns out that the conditions expressed in Theorem 4, namely that the skeleton and the dense set are Boolean algebras, and the mapping $a \to a \vee \Omega$ is an isomorphism between these two Boolean algebras, characterize $R|R$ among Stone algebras. This is made precise in the following theorem.

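Theorem 5. Let $L$ be a Stone algebra, $L^*$ its skeleton, and $D$ its dense set. Suppose that $D$ is a Boolean algebra, and that the mapping $a \to a \vee \Omega$ is an isomorphism from $L^*$ to $D$, where $\Omega$ is the 0 of $D$. Then the mapping $\varphi : L \to D|D : a \to ((a \vee \Omega) \mid (a \vee a^*))$ is an isomorphism.

Proof. First, note that $\varphi$ is indeed a mapping from $L$ into $D|D$. $a \vee \Omega$ is in $D$ since $D$ is a dual ideal and $\Omega$ is in $D$. $(a \vee a^*)^* = a^* \wedge a^{**} = 0$, so $a \vee a^*$ is also in $D$. We now break the proof up into several steps.

(1) $\varphi$ is one-to-one.

Suppose that $(a \vee \Omega) \mid (a \vee a^*) = (b \vee \Omega) \mid (b \vee b^*)$. Then

$$(a \vee \Omega) \wedge (a \vee a^*) = (b \vee \Omega) \wedge (b \vee b^*),$$

and since $(a \vee \Omega) \wedge (a \vee a^*) = a \vee (\Omega \wedge (a \vee a^*)) = a \vee \Omega$, this gives $a \vee \Omega = b \vee \Omega$. We also have $a \vee a^* = b \vee b^*$, so that $a^* = b^*$: in the Boolean algebra $D$, both $a^* \vee \Omega$ and $b^* \vee \Omega$ meet $a \vee \Omega = b \vee \Omega$ in $\Omega$ and join with it to $a \vee a^* = b \vee b^*$, hence they are equal, and $x \to x \vee \Omega$ is one-to-one on $L^*$. Multiplying $a \vee a^* = b \vee a^*$ through by $a$, we get $a \wedge (a \vee a^*) = a \wedge (b \vee a^*)$, that is, $a = a \wedge b$, and by symmetry, $b = a \wedge b$, so $a = b$. Thus $\varphi$ is one-to-one.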

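(2) $\varphi$ preserves $\vee$.

$$
\begin{aligned}
\varphi(a) \vee \varphi(b)
&= ((a \vee \Omega) \mid (a \vee a^*)) \vee ((b \vee \Omega) \mid (b \vee b^*)) \\
&= (a \vee b \vee \Omega \mid ((a \vee \Omega) \wedge (a \vee a^*)) \vee ((b \vee \Omega) \wedge (b \vee b^*)) \vee ((a \vee a^*) \wedge (b \vee b^*))) \\
&= (a \vee b \vee \Omega \mid a \vee \Omega \vee b \vee \Omega \vee (a \wedge b) \vee (a^* \wedge b) \vee (a \wedge b^*) \vee (a^* \wedge b^*)) \\
&= (a \vee b \vee \Omega \mid a \vee b \vee (a^* \wedge b^*)) \\
&= (a \vee b \vee \Omega \mid a \vee b \vee (a \vee b)^*) = \varphi(a \vee b).
\end{aligned}
$$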

Some preliminaries are needed before showing that $\varphi$ preserves $\wedge$. Since $\wedge$ in $D|D$ involves the complement in the Boolean algebra $D$, we need to figure out what it is. We have the isomorphism $a \to a \vee \Omega$ from $L^*$ to $D$, and the complement operator on $L^*$ is $*$ itself. For $a \in L$, let $x_a$ be the (unique) element in $L^*$ such that $x_a \vee \Omega = a \vee \Omega$. Thus the complement in $D$, which we will denote by $'$, is given by $(a \vee \Omega)' = x_a^* \vee \Omega$. This is simply because the mapping $a \to a \vee \Omega$ is an isomorphism between $L^*$ and $D$. If $a$ itself is in $D$, then $a' = x_a^* \vee \Omega$.

For $a \in L$, it turns out that a pertinent question for us is the relation between $a^*$ and $x_a^*$. Note that for $a \in L$, $a = a^{**} \wedge (a \vee a^*)$. This is because $a^{**} \geq a$ and $a^{**} \wedge a^* = 0$. Thus $a \vee \Omega = (a^{**} \vee \Omega) \wedge (a \vee a^*)$ since $\Omega \wedge (a \vee a^*) = \Omega$, so that

$$
\begin{aligned}
a \vee \Omega = x_a \vee \Omega
&= (a^{**} \vee \Omega) \wedge (a \vee a^*) \\
&= (a^{**} \vee \Omega) \wedge (a \vee \Omega \vee a^* \vee \Omega) \\
&= (a^{**} \vee \Omega) \wedge (x_a \vee \Omega \vee a^* \vee \Omega) \\
&= (a^{**} \wedge x_a) \vee \Omega.
\end{aligned}
$$

Now from $x_a \vee \Omega = (a^{**} \wedge x_a) \vee \Omega$ we get $a^{**} \wedge x_a = x_a$, so that $a^{**} \geq x_a$. In particular, $a^{***} = a^* \leq x_a^*$. We sum up these facts.

(3) For $a \in L$, let $x_a$ be the element in $L^*$ such that $x_a \vee \Omega = a \vee \Omega$. Then the complement operator $'$ in $D$ is given by $(a \vee \Omega)' = x_a^* \vee \Omega$, and $a^* \leq x_a^*$.

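(4) $\varphi$ preserves $\wedge$.

$$
\begin{aligned}
\varphi(a) \wedge \varphi(b)
&= ((a \vee \Omega) \mid (a \vee a^*)) \wedge ((b \vee \Omega) \mid (b \vee b^*)) \\
&= ((a \vee \Omega) \wedge (b \vee \Omega) \mid ((a \vee \Omega)' \wedge (a \vee a^*)) \vee ((b \vee \Omega)' \wedge (b \vee b^*)) \vee ((a \vee a^*) \wedge (b \vee b^*))) \\
&= ((a \vee \Omega) \wedge (b \vee \Omega) \mid ((x_a^* \vee \Omega) \wedge (a \vee a^*)) \vee ((x_b^* \vee \Omega) \wedge (b \vee b^*)) \vee ((a \vee a^*) \wedge (b \vee b^*))).
\end{aligned}
$$

Since

$$
\begin{aligned}
(x_a^* \vee \Omega) \wedge (a \vee a^*)
&= (x_a^* \vee \Omega) \wedge (a \vee \Omega \vee a^* \vee \Omega) \\
&= ((x_a^* \vee \Omega) \wedge (a \vee \Omega)) \vee ((x_a^* \vee \Omega) \wedge (a^* \vee \Omega)) \\
&= \Omega \vee (x_a^* \wedge a^*),
\end{aligned}
$$

and since $a^* \leq x_a^*$, we have $\Omega \vee (x_a^* \wedge a^*) = a^* \vee \Omega$. Thus

$$
\begin{aligned}
\varphi(a) \wedge \varphi(b)
&= ((a \vee \Omega) \wedge (b \vee \Omega) \mid a^* \vee \Omega \vee b^* \vee \Omega \vee (a \wedge b) \vee (a^* \wedge b) \vee (a \wedge b^*) \vee (a^* \wedge b^*)) \\
&= ((a \vee \Omega) \wedge (b \vee \Omega) \mid a^* \vee b^* \vee (a \wedge b)) \\
&= ((a \wedge b) \vee \Omega \mid (a \wedge b)^* \vee (a \wedge b)) \\
&= \varphi(a \wedge b).
\end{aligned}
$$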

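To complete the proof, we need that $\varphi$ is onto. Given an element $(a \mid b)$ in $D|D$, it is not obvious just what element $\varphi$ takes onto it. How would we find this element in the case $L$ were $R|R$ itself? In that case, we are given an element $(1 \mid a) \mid (1 \mid b) = (1 \mid ab) \mid (1 \mid b)$ in $D(R|R) \mid D(R|R)$, recalling the fact that $1|R$ is a Boolean ring, and need the element $(a \mid b) = (ab \mid b)$, which $\varphi$ does indeed take to $(1 \mid a) \mid (1 \mid b)$. There is one key observation to be made. First note that for $(a \mid b)$ in $R|R$, say, $(a \mid b) = (ab \mid b) = (ab \mid 1) \vee (0 \mid b)$. Now consider our map $\varphi$ as applied to $R|R$. Then

$$
\begin{aligned}
\varphi(a \mid b)
&= ((a \mid b) \vee (1 \mid 0)) \mid ((a \mid b) \vee (a \mid b)^*) \\
&= ((ab \mid 1) \vee (0 \mid b) \vee (1 \mid 0)) \mid ((ab \mid 1) \vee (0 \mid b) \vee ((ab \mid 1) \vee (0 \mid b))^*).
\end{aligned}
$$

Since

$$(ab \mid 1) \vee (0 \mid b) \vee (1 \mid 0) = (1 \mid ab) = (ab \mid 1) \vee (1 \mid 0),$$

and

$$(ab \mid 1) \vee (0 \mid b) \vee ((ab \mid 1) \vee (0 \mid b))^* = (1 \mid b) = (0 \mid b) \vee (0 \mid b)^*,$$

we will be in business if from $(1 \mid ab) \mid (1 \mid b)$ we can construct, in a Stone algebra satisfying our hypotheses, elements corresponding to $(ab \mid 1)$ and $(0 \mid b)$. So we know the elements $(1 \mid ab)$ and $(1 \mid b)$, and so know $ab$ and $b$. The element corresponding to $(ab \mid 1)$ is, in the notation above, $x_{a \wedge b}$. The element corresponding to $(0 \mid b)$ is the element $x_b^* \wedge \Omega$, since in $R|R$, $(0 \mid b) = (b \mid 1)^* \wedge (1 \mid 0)$. Now this dictates that given the element $(a \mid b)$ in $D|D$, $\varphi$ should take $x_{a \wedge b} \vee (x_b^* \wedge \Omega)$ onto it. We check:

$$
\begin{aligned}
\varphi(x_{a \wedge b} \vee (x_b^* \wedge \Omega))
&= (x_{a \wedge b} \vee (x_b^* \wedge \Omega) \vee \Omega) \mid (x_{a \wedge b} \vee (x_b^* \wedge \Omega) \vee (x_{a \wedge b} \vee (x_b^* \wedge \Omega))^*) \\
&= (x_{a \wedge b} \vee \Omega) \mid (x_{a \wedge b} \vee \Omega \vee (x_{a \wedge b}^* \wedge x_b)) \\
&= (a \wedge b) \mid ((a \wedge b) \vee ((x_{a \wedge b}^* \vee \Omega) \wedge (x_b \vee \Omega))) \\
&= (a \wedge b) \mid ((a \wedge b) \vee ((a \wedge b)' \wedge b)) \\
&= (a \wedge b) \mid ((a \wedge b) \vee ((a' \vee b') \wedge b)) \\
&= (a \wedge b) \mid ((a \wedge b) \vee (a' \wedge b)) \\
&= ((a \wedge b) \mid b).
\end{aligned}
$$

This completes the proof. □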
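For the concrete case $L = R|R$, the map $\varphi$ and the preimage formula used in the "onto" step can be checked by enumeration. This is only a sketch under stated assumptions: $R$ is taken to be the power set of a two-point set encoded as bitmasks, elements of $D|D$ are stored in the normal form $(p \wedge q \mid q)$ with $p, q \in D$, and the names `phi`, `skel`, and `preimage` are hypothetical.

```python
N = 2                       # assumed size of the underlying sample space
FULL = (1 << N) - 1

def neg(a):
    return FULL & ~a

def ce(a, b):
    """Normal form (ab | b) of the conditional event (a | b)."""
    return (a & b, b)

def join(x, y):
    (a, b), (c, d) = x, y
    return ce(a | c, (a & b) | (c & d) | (b & d))

def meet(x, y):
    (a, b), (c, d) = x, y
    return ce(a & c, (neg(a) & b) | (neg(c) & d) | (b & d))

def star(x):
    a, b = x
    return ce(neg(a) & b, FULL)

EVENTS = sorted({ce(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})
OMEGA = ce(FULL, 0)                                  # the 0 of D
SKELETON = [ce(a, FULL) for a in range(FULL + 1)]    # L* = R|1
DENSE = [ce(b, b) for b in range(FULL + 1)]          # D  = 1|R

def phi(x):
    """phi(x) = ((x v Omega) | (x v x*)) in D|D, as the pair (p ^ q, q)."""
    p, q = join(x, OMEGA), join(x, star(x))
    return (meet(p, q), q)

def skel(d):
    """The unique skeleton element s with s v Omega = d, for d in D."""
    (s,) = [s for s in SKELETON if join(s, OMEGA) == d]
    return s

def preimage(p, q):
    """The element x_{p^q} v (x_q* ^ Omega) claimed to map onto (p | q)."""
    return join(skel(meet(p, q)), meet(star(skel(q)), OMEGA))

def phi_is_bijection_onto_DD():
    targets = {(meet(p, q), q) for p in DENSE for q in DENSE}
    images = [phi(x) for x in EVENTS]
    return len(set(images)) == len(EVENTS) and set(images) == targets

def preimage_formula_is_correct():
    return all(phi(preimage(p, q)) == (meet(p, q), q)
               for p in DENSE for q in DENSE)
```

In this model `phi` sends $(a \mid b)$ to the pair $((1 \mid a), (1 \mid b))$, in line with the discussion in the proof.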

Several comments are in order. First, since $L^*$ is isomorphic to $D$, $L$ is isomorphic to $L^*|L^*$. The theorem was stated using $D|D$ since the isomorphism from $L$ into $D|D$ is more simply and elegantly defined than the one from $L$ into $L^*|L^*$.

Second, in the statement of the theorem, one need only assume that $D$ is a bounded lattice and that $a \to a \vee \Omega$ is a one-to-one mapping from $L^*$ onto $D$. That mapping is then automatically an isomorphism since $\vee$ and $\wedge$ are preserved in any case.

In $R|R$, one has the "complementation" $'$ given by $(a \mid b)' = (a' \mid b)$. No mention or use of it has been made in our theorem. In the Boolean algebra $L^*$, $*$ is the complementation operator. In a Stone algebra satisfying the hypothesis of Theorem 5, the Boolean algebra $D$ must itself also have a complementation operator, and we identified the complement of an element $a$ in $D$, as an element of the Boolean algebra $D$, to be the element $x_a^* \vee \Omega$. In a Stone algebra satisfying the hypothesis of Theorem 5, what is the operator corresponding to $'$ in $R|R$? The element $a$ in $L$ corresponds to $(a \vee \Omega) \mid (a \vee a^*)$ in $D|D$, and

$$((a \vee \Omega) \mid (a \vee a^*))' = (a \vee \Omega)' \mid (a \vee a^*) = (x_a^* \vee \Omega) \mid (a \vee a^*).$$

The preimage of this element under our isomorphism is the element

$$x_{(x_a^* \vee \Omega) \wedge (a \vee a^*)} \vee ((x_{a \vee a^*})^* \wedge \Omega),$$

which by a routine calculation is $a^* \vee (x_a^* \wedge a^{**} \wedge \Omega)$. In other words, for $a$ in $L$,

$$a' = a^* \vee (x_a^* \wedge a^{**} \wedge \Omega) = x_a^* \wedge (a^* \vee \Omega) = a^* \vee (x_a^* \wedge \Omega).$$
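The three expressions for $a'$ can be compared with $(a \mid b)' = (a' \mid b)$ in a small model of $R|R$. The sketch below assumes, for illustration only, that $R$ is the power set of a two-point set encoded as bitmasks; `skel_of` (computing $x_a$) and the other helper names are hypothetical.

```python
N = 2                       # assumed size of the underlying sample space
FULL = (1 << N) - 1

def neg(a):
    return FULL & ~a

def ce(a, b):
    """Normal form (ab | b) of the conditional event (a | b)."""
    return (a & b, b)

def join(x, y):
    (a, b), (c, d) = x, y
    return ce(a | c, (a & b) | (c & d) | (b & d))

def meet(x, y):
    (a, b), (c, d) = x, y
    return ce(a & c, (neg(a) & b) | (neg(c) & d) | (b & d))

def star(x):
    a, b = x
    return ce(neg(a) & b, FULL)

def prime(x):
    """The operator (a|b)' = (a'|b) of R|R."""
    a, b = x
    return ce(neg(a), b)

EVENTS = sorted({ce(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})
OMEGA = ce(FULL, 0)
SKELETON = [ce(a, FULL) for a in range(FULL + 1)]

def skel_of(x):
    """x_a: the unique skeleton element s with s v Omega = x v Omega."""
    (s,) = [s for s in SKELETON if join(s, OMEGA) == join(x, OMEGA)]
    return s

def prime_formulas_agree():
    for x in EVENTS:
        xs = star(skel_of(x))                                   # x_a*
        e1 = join(star(x), meet(meet(xs, star(star(x))), OMEGA))
        e2 = meet(xs, join(star(x), OMEGA))
        e3 = join(star(x), meet(xs, OMEGA))
        if not (e1 == e2 == e3 == prime(x)):
            return False
    return True
```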

Definition 2. An abstract conditional space is a Stone algebra $L$ such that

(1) its dense set $D$ is a bounded lattice, and

(2) the mapping from its skeleton $L^*$ to $D$ given by $a \to a \vee \Omega$, where $\Omega$ is the 0 of $D$, is a bijection.

Of course, $D$ is also a Boolean algebra. An alternate way to phrase this definition is to require that $D$ is a Boolean algebra and that the mapping is an isomorphism. That is the phraseology in Theorem 5. The conditions in Definition 2 are not really weaker although they appear to be. In any case, an abstract conditional space is just $R|R$ for some Boolean ring $R$.

There are other versions of Theorem 5 available to us. In $R|R$, the operator $'$ plays a significant role, as does the special element $(1 \mid 0)$. One can arrive at a representation theorem by postulating these two entities on a Stone algebra and requiring certain properties of them. The following is an example along this line.

Theorem 6. Let $L$ be a Stone algebra, $L^*$ its skeleton and $D$ its dense set. Suppose that there is an element $\Omega \in L$ such that $D = L^* \vee \Omega$, and that there is a unary operator $'$ on $L$ that coincides with $*$ on $L^*$, and satisfies $(x \vee \Omega)' = x' \wedge \Omega'$ for all $x \in L^*$, and $\Omega'^* = 0$. Then $L$ is isomorphic to $L^*|L^*$.

Proof. If $x \in L^*$, then $(x \vee \Omega)'^* = (x' \wedge \Omega')^* = x'^* \vee \Omega'^* = x \vee 0 = x$, so the map $L^* \to D : x \to x \vee \Omega$ is a bijection. Since $0 \vee \Omega = \Omega$ is in its image, $\Omega \in D$. The map $L \to D : x \to x \vee d$ for any $d \in D$ preserves $\vee$ and $\wedge$, so $L^*$ and $D$ are isomorphic Boolean algebras, $\Omega$ is the zero of the Boolean algebra $D$, and Theorem 5 applies. □

4.3 Semi-simple MV-algebras

This section consists of proving that the algebraic system $R|R$ can be enriched in a simple way to obtain a (Chang) MV-algebra, so that a formal relationship with fuzzy logics is established. This latter fact follows from Belluce (1986).

First, as stated earlier, in view of the three-valued logic connection, $R|R$, equipped with any given system of basic operators on it, is an algebraic structure generalizing Boolean ring structure. This generalization can be viewed in various different ways, depending upon the given system of operators. In Section 4.1 we have seen that when $R|R$ is equipped with our operators $(\wedge, \vee, (\cdot)')$, then $R|R$ is a special type of Stone algebra where the associated pseudocomplementation $*$ is

$$(a \mid b)^* = a' \wedge b \ (= a' \cdot b).$$

In an (independent) pioneering work, Schay (1968) took the equivalent viewpoint by modeling conditional events as generalized three-valued indicator functions. By doing so, he considered $R|R$ as an algebraic structure with a system of five operators $(\cap, \cup, \wedge, \vee, (\cdot)')$ (where his $\wedge$, $\vee$, $(\cdot)'$ are different from ours).

Abstracting this algebraic structure, he spent almost half of his work on establishing a Stone's Representation Theorem for his new structure (Schay, 1968, pp. 338-342). While the mathematics involved is interesting, his axioms for the abstract structure are quite complicated.

In  another  direction,  motivated  by  the  desire  of  establishing  a  three-way  relationship 
among  formal  systems,  MV-algebras  and  fuzzy  sets  in  the  context  of  multi-valued  logics 
(as  an  analog  to  the  case  of  classical  two-valued  logic,  where  there  is  such  a  relationship 
among  formal  systems,  Boolean  rings  and  set  theory),  Belluce  (1986)  considered  a 
generalized  structure  known  as  Chang  MV-algebra.  This  algebraic  structure  is  known  in 
multi-valued  logics  (Chang,  1958,  1959).  Roughly  speaking,  such  a  structure  is  obtained 
when  the  idempotency  and  the  distributive  law  in  a  boolean  ring  R(+,  •)  are  both 
dropped. 

Specifically, following Belluce, an MV-algebra is a non-empty set $A$ with two binary operators $+$, $\cdot$, and one unary operator $x \mapsto \bar{x}$, with 0, 1 satisfying the following conditions.

(i) $\langle A, +, 0 \rangle$ and $\langle A, \cdot, 1 \rangle$ are commutative semigroups with identity.

(ii) For all $x \in A$,

$$x + \bar{x} = 1, \quad x \cdot \bar{x} = 0, \quad \bar{1} = 0, \quad \bar{0} = 1.$$

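(iii) For all $x, y \in A$,

$$\overline{(x + y)} = \bar{x} \cdot \bar{y}, \quad \overline{x \cdot y} = \bar{x} + \bar{y}, \quad \bar{\bar{x}} = x$$

(that is, $\bar{\ }$ is involutive, and of "De Morgan" type with respect to $+$ and $\cdot$).

(iv) $+$ and $\cdot$ are such that, if one defines two "Boolean-like" operators

$$x \vee y = x + \bar{x} y, \quad x \wedge y = (x + \bar{y}) y,$$

then $\langle A, \vee, 0 \rangle$, $\langle A, \wedge, 1 \rangle$ are also commutative semigroups with identity.

(v) For all $x, y, z \in A$,

$$x \cdot (y \vee z) = x \cdot y \vee x \cdot z, \quad (x + y) \wedge (x + z) = x + (y \wedge z).$$

Notice that $\langle A, \wedge, \vee, 1, 0 \rangle$ is a bounded commutative lattice where the associated order relation $\leq$ is $x \leq y$ if and only if $x \wedge y = x$.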

Definition. An MV-algebra $A$ is said to be archimedean when for each $x, y \in A$, if $(x + \cdots + x) = nx \leq y$ for all $n \geq 0$, then $x \cdot y = x$.

A result in Belluce (1986) states that archimedean MV-algebras and semi-simple MV-algebras are the same.

With analogous algebraic concepts for MV-algebras, an MV-algebra $A$ is said to be semi-simple if its radical is zero. (See Belluce, 1986, for details.) The point is this: semi-simple (or equivalently, archimedean) MV-algebras are precisely "bold" algebras of fuzzy sets (Belluce, 1986, Theorem 4), where by a "bold" algebra of fuzzy sets, one means a subalgebra of the MV-algebra (under induced operations) of all fuzzy subsets of some space $\Omega$, that is, the collection of all functions $f : \Omega \to [0, 1]$. Specifically, $[0, 1]^{\Omega}$ becomes an MV-algebra with:

$$
\begin{aligned}
(f + g)(\omega) &= \mathrm{Min}(1, f(\omega) + g(\omega)) \\
(f \cdot g)(\omega) &= \mathrm{Max}(0, f(\omega) + g(\omega) - 1) \\
\bar{f}(\omega) &= 1 - f(\omega) \\
(f \vee g)(\omega) &= \mathrm{Max}(f(\omega), g(\omega)) \\
(f \wedge g)(\omega) &= \mathrm{Min}(f(\omega), g(\omega)).
\end{aligned}
$$
We proceed now to show that $R|R$ can be viewed as an archimedean MV-algebra, so that algebraically speaking, conditional logic in a sense is a form of fuzzy logic. The search for operations on $R|R$ making $R|R$ an MV-algebra is dictated by the operations $+$, $\cdot$ of fuzzy sets or of corresponding operations on $[0, 1]$, and this using Theorem 1 of Section 3.4.

In the three-valued logic setting, viewing $u$ as "lying" between 0 and 1, say $u = 1/2$, we can treat $u$ as a real number in $[0, 1]$. In this vein, consider $\oplus$ and $\circ$ defined on $[0, 1]^2$ by

$$
\begin{aligned}
x \oplus y &= \mathrm{Min}(1, x + y), \\
x \circ y &= \mathrm{Max}(0, x + y - 1).
\end{aligned}
$$

The restrictions of $\oplus$ and $\circ$ to $\{0, u, 1\}^2$ yield values in $\{0, u, 1\}$, and hence correspond to truth functions of operations on $R|R$. So let $\psi : \{0, 1/2, 1\}^2 \to \{0, 1/2, 1\}$ be defined by

$$\psi(i, j) = \mathrm{Max}\{0, i + j - 1\}.$$

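We have

$$\psi^{-1}(1) = \{(1, 1)\},$$

$$\psi^{-1}(0) = \{(0, 0), (0, 1/2), (1/2, 0), (1/2, 1/2), (0, 1), (1, 0)\}.$$

Recall that, for $a, b, c, d \in R$ with $a \leq b$, $c \leq d$, the pair $(i, j)$ corresponds to $w_i(a \mid b)\, w_j(c \mid d)$, where

$$
w_i(a \mid b) =
\begin{cases}
a'b & \text{if } i = 0 \\
b' & \text{if } i = 1/2 \\
ab \ (= a) & \text{if } i = 1
\end{cases}
$$

and

$$f_\psi : (R|R)^2 \to R|R$$

is determined by

$$
f_\psi((a \mid b), (c \mid d)) = \Big( \bigvee_{(i,j) \in \psi^{-1}(1)} w_i(a \mid b)\, w_j(c \mid d) \ \Big|\ \bigvee_{(i,j) \in \psi^{-1}(1) \cup \psi^{-1}(0)} w_i(a \mid b)\, w_j(c \mid d) \Big).
$$

Thus, here

$$\bigvee_{(i,j) \in \psi^{-1}(1)} w_i(a \mid b)\, w_j(c \mid d) = ac,$$

and

$$
\begin{aligned}
\bigvee_{(i,j) \in \psi^{-1}(1) \cup \psi^{-1}(0)} w_i(a \mid b)\, w_j(c \mid d)
&= ac \vee a'bc'd \vee a'bd' \vee b'c'd \vee b'd' \vee a'bc \vee ac'd \\
&= ac \vee b'd' \vee a'b(c'd \vee d' \vee c) \vee c'd(b' \vee a) \\
&= ac \vee b'd' \vee a'b \vee c'd(a'b)' \\
&= ac \vee b'd' \vee a'b \vee c'd,
\end{aligned}
$$

noting that since $c \leq d$, $c'd \vee d' \vee c = 1$, and that $a'b \vee (c'd)(a'b)' = a'b \vee c'd$. Hence

$$
\begin{aligned}
f_\psi((a \mid b), (c \mid d)) &= (ac \mid ac \vee a'b \vee c'd \vee b'd') \\
&= (a \mid b)(c \mid d)(b \vee d \mid 1).
\end{aligned}
$$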

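Similarly, for $\psi(i, j) = \mathrm{Min}(1, i + j)$, we have

$$\psi^{-1}(1) = \{(1/2, 1/2), (1/2, 1), (1, 1/2), (1, 1), (0, 1), (1, 0)\},$$

$$\psi^{-1}(0) = \{(0, 0)\},$$

$$
\begin{aligned}
\bigvee_{(i,j) \in \psi^{-1}(1)} w_i(a \mid b)\, w_j(c \mid d)
&= b'd' \vee b'c \vee ad' \vee ac \vee a'bc \vee ac'd \\
&= b'd' \vee a(d' \vee c'd \vee c) \vee c(b' \vee a'b) \\
&= b'd' \vee a \vee c(a') \\
&= a \vee c \vee b'd',
\end{aligned}
$$

and

$$
\begin{aligned}
\bigvee_{(i,j) \in \psi^{-1}(1) \cup \psi^{-1}(0)} w_i(a \mid b)\, w_j(c \mid d)
&= a \vee c \vee b'd' \vee a'bc'd \\
&= a \vee c \vee b'd' \vee bd(a \vee c)' \\
&= a \vee c \vee b'd' \vee bd.
\end{aligned}
$$

Hence

$$f_\psi((a \mid b), (c \mid d)) = (a \vee c \vee b'd' \mid a \vee c \vee b'd' \vee bd) = (a \mid b) \vee (c \mid d) \vee (b'd' \mid 1).$$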

This suggests the following new operations on R|R:

$$(a|b) \oplus (c|d) = (a|b) \vee (c|d) \vee (b'd' \mid 1),$$
$$(a|b) \odot (c|d) = (a|b)(c|d)(b \vee d \mid 1).$$
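The correspondence with the truth functions Min(1, i + j) and Max(0, i + j - 1) can be checked by brute force on a small example. The sketch below is illustrative only: the three-point universe, the pair representation (ab, b) of a conditional event, and all function names are assumptions made for this check, not constructions from the text.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({0, 1, 2})   # small universe, chosen only for illustration
HALF = Fraction(1, 2)          # numeric stand-in for the third truth value u


def subsets(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def cond(a, b):
    """Represent (a|b) by the pair (ab, b), so the consequent lies below b."""
    return (a & b, b)


def truth(event, w):
    """Three-valued indicator of (a|b) at the point w."""
    a, b = event
    if w not in b:
        return HALF
    return Fraction(1) if w in a else Fraction(0)


def oplus(x, y):
    """(a|b) + (c|d) as (a v c v b'd' | a v c v b'd' v bd)."""
    (a, b), (c, d) = x, y
    top = a | c | ((OMEGA - b) & (OMEGA - d))
    return cond(top, top | (b & d))


def odot(x, y):
    """(a|b) . (c|d) as (ac | ac v a'b v c'd v b'd')."""
    (a, b), (c, d) = x, y
    top = a & c
    bottom = top | (b - a) | (d - c) | ((OMEGA - b) & (OMEGA - d))
    return cond(top, bottom)


def matches_lukasiewicz():
    """Compare both operations pointwise with the truncated sum and product."""
    events = [cond(a, b) for b in subsets(OMEGA) for a in subsets(b)]
    for x in events:
        for y in events:
            for w in OMEGA:
                i, j = truth(x, w), truth(y, w)
                if truth(oplus(x, y), w) != min(1, i + j):
                    return False
                if truth(odot(x, y), w) != max(0, i + j - 1):
                    return False
    return True
```

Exact rational arithmetic is used so that the comparison with 1/2 involves no rounding.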

Theorem 1. $(R|R,\ \oplus,\ \odot,\ ',\ (0|1),\ (1|1))$ is a semi-simple MV-algebra.

Proof. We verify the axioms of an MV-algebra. $\langle R|R,\ \oplus,\ (0|1) \rangle$ is a commutative semi-group with identity $(0|1)$. Indeed, the commutativity follows from the symmetry in the definition of $\oplus$ above; when $c = 0$ and $d = 1$,

$$(a|b) \vee (0|1) \vee (0|1) = (a|b).$$

Similarly, $\langle R|R,\ \odot,\ (1|1) \rangle$ is a commutative semi-group with identity $(1|1)$. Next, the complement operation of the MV-algebra is taken to be $'$, and we have

$$(a|b) \oplus (a|b)' = (a|b) \oplus (a'|b) = (a|b) \vee (a'|b) \vee (b'|1) = (1|b) \vee (b'|1) = (1|1);$$
$$(a|b) \odot (a'|b) = (a|b)(a'|b)(b|1) = (0|b)(b|1) = (0|1),$$
$$(0|1)' = (1|1).$$

Next, always assuming that $a \le b$, $c \le d$,

$$((a|b) \oplus (c|d))' = (a \vee c \vee b'd' \mid a \vee c \vee bd \vee b'd')'$$
$$= (a'c'b \vee a'c'd \mid a \vee c \vee bd \vee b'd')$$
$$= ((a'c'b \vee a'c'd)(a \vee c \vee bd \vee b'd') \mid a \vee c \vee bd \vee b'd')$$
$$= (a'c'bd \mid a \vee c \vee bd \vee b'd')$$
$$= (a'|b) \odot (c'|d) = (a|b)' \odot (c|d)';$$

$$((a|b) \odot (c|d))' = (ac \mid a'b \vee c'd \vee bd \vee b'd')'$$
$$= (a' \vee c' \mid a'b \vee c'd \vee bd \vee b'd')$$
$$= ((a' \vee c')(a'b \vee c'd \vee bd \vee b'd') \mid a'b \vee c'd \vee bd \vee b'd')$$
$$= (a'b \vee c'd \vee b'd' \mid a'b \vee c'd \vee bd \vee b'd')$$
$$= (a'|b) \oplus (c'|d) = (a|b)' \oplus (c|d)',$$

noting that $a \le b$ and $c \le d$ imply that $a'b' = b'$ and $c'd' = d'$.

It is easy to see that $(a|b)'' = (a|b)$, and it is readily checked that

$$(a|b) \oplus [(a|b)' \odot (c|d)] = (a|b) \vee (c|d)$$

and

$$((a|b) \oplus (c'|d)) \odot (c|d) = (a|b) \wedge (c|d).$$

Thus, $\vee$ and $\wedge$ on R|R are precisely the derived operations. Also, $\langle R|R,\ \vee,\ (0|1) \rangle$ and $\langle R|R,\ \wedge,\ (1|1) \rangle$ are commutative semi-groups with identity. Finally, it can be checked that

$$(a|b) \odot [(c|d) \vee (e|f)] = [(a|b) \odot (c|d)] \vee [(a|b) \odot (e|f)],$$
$$(a|b) \oplus [(c|d) \wedge (e|f)] = [(a|b) \oplus (c|d)] \wedge [(a|b) \oplus (e|f)].$$

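For $n \ge 2$,

$$\underbrace{(a|b) \oplus \cdots \oplus (a|b)}_{n \text{ times}} = (b' \vee a \mid 1).$$

Thus if

$$(c|d) \ge (b' \vee a \mid 1),$$

then for all $n > 0$,

$$\underbrace{(a|b) \oplus \cdots \oplus (a|b)}_{n \text{ times}} \le (c|d)$$

(and conversely, by taking $n = 2$), and for $a \le b$,

$$(a|b) \odot (b' \vee a \mid 1) = (a|b).$$

Hence

$$(a|b) \odot (c|d) = (a|b),$$

that is, R|R is archimedean. Indeed,

$$(a|b) \odot (c|d) = (a|b)(c|d)(b \vee d \mid 1) = (a|b)(b \vee d \mid 1) = (a|b),$$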
since $(a|b) \le (b' \vee a \mid 1) \le (c|d)$, using the criterion that $(a|b) \le (c|d)$ if and only if $ab \le cd$ and $c'd \le a'b$. □

Remarks. There are some similarities with fuzzy logic.

(i) The two additional operations $\oplus$ and $\odot$ on R|R are defined in terms of the original logical operations $\vee$, $\wedge$. The algebraic structure

$$(R|R,\ \oplus,\ \odot,\ (\cdot)',\ (0|1),\ (1|1),\ \le)$$

is somewhat similar to a quantum logic, since with respect to $\oplus$ and $\odot$, the operator $'$ is an ortho-complementation, so that the law of excluded middle holds, and $\odot$ is not distributive over $\oplus$. However, $\odot$ is not idempotent.

(ii) In fuzzy logic (for example, Zadeh, 1983), the basic connectives are defined in terms of operations on the unit interval [0, 1]: $\vee = \max$, $\wedge = \min$, $' = 1 - (\cdot)$. As in Belluce (1986), [0, 1] becomes an MV-algebra when one introduces new operations $\oplus$, $\odot$, and $\bar{\ }$ given by

$$x \oplus y = 1 \wedge (x + y),$$
$$x \odot y = 0 \vee (x + y - 1),$$

for $x, y \in [0, 1]$. In turn, $\wedge$ and $\vee$ are expressed in terms of $\oplus$ and $\odot$ by

$$x \wedge y = (x \oplus \bar{y}) \odot y,$$

and

$$x \vee y = x \oplus (\bar{x} \odot y).$$


(iii) For u = 1/2, Lukasiewicz's three-valued logic is a subalgebra of the MV-algebra [0, 1], that is, is a "bold" algebra of fuzzy sets. An alternative proof of Theorem 1 is obtained by using Theorem 2 of Section 3.4, and making the easy verification of the above fact.

Let $A = \{0, 1/2, 1\}$. Define, for $x, y \in A$,

$$x \oplus y = \min(1,\ x + y),$$
$$x \odot y = \max(0,\ x + y - 1),$$
$$\bar{x} = 1 - x.$$

Then

$$x \vee y = \max(x, y),$$

CHAPTER 5


CONDITIONAL  EVENTS  AND  PROBABILITY 


The  connection  between  logic  and  probability  is  apparent  in  automated  reasoning 
processes  under  uncertainty.  A  systematic  study  of  the  extension  of  probability  logic  to 
the  conditional  case  will  be  presented  in  Chapter  6.  In  this  Chapter  5,  we  establish 
various  basic  properties  of  probability  measures  extended  to  the  algebra  of  conditional 
events  as  well  as  the  justification  of  assigning  conditional  probabilities  to  conditional 
events.  We  discuss  the  association  of  randomness  to  conditional  events  (such  as  random 
sets,  random  conditional  events,  random  conditional  variables).  Finally,  a  general  concept 
of  qualitative  (or  measure-free)  conditional  independence  is  introduced. 

5.1  Uncertainty  measures  on  conditionals 

It is an accepted thesis that uncertainty is essentially conditional, that is, the uncertainty of an event is always conditioned upon some other events. At the numerical level, that is, when uncertainty is taken in a quantitative way, a natural domain for uncertainty measures is a conditional space R|R. For example, in order to make rigorous Lindley's discussions on the inadmissibility of uncertainty measures in expert systems, via the scoring rule approach (Lindley, 1982), it is necessary to invoke conditional events (Goodman, Nguyen and Rogers, 1990).

By an uncertainty measure $\mu$ on R|R, we mean a map $\mu : R|R \to \mathbb{R}$, say, where $\mathbb{R}$ denotes the set of real numbers. Now, for $(a|b) \in R|R$, we have $(a|b) = [ab,\ b' \vee a]$, an interval in R (see Section 2.3). Thus, an uncertainty measure on R|R can be derived from a map $\nu : R \to \mathbb{R}$ as follows.

$$\mu(a|b) = F(\nu(ab),\ \nu(b' \vee a)),$$

where

$$F : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$$

is some given function. For example, if $\nu = P$, a probability measure on R, and

$$F(x, y) = \frac{x}{1 + x - y},$$

we have $\mu(a|b) = P(a|b)$, provided $P(b) > 0$. (See also Dubois and Prade, 1991.) It is obvious that, in this way, various uncertainty measures on R|R can be considered. In this book, however, we are concerned only with uncertainty measures derived from probability measures. For other types of uncertainty measures, see Dubois and Prade (1991).
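As a numerical check of the example F(x, y) = x / (1 + x - y), the sketch below evaluates F at the two endpoints of the interval [ab, b' ∨ a] and compares the result with P(ab)/P(b). The four-point sample space and its weights are hypothetical values chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical finite probability space used only for this illustration.
OMEGA = frozenset({1, 2, 3, 4})
WEIGHTS = {1: Fraction(1, 10), 2: Fraction(2, 10),
           3: Fraction(3, 10), 4: Fraction(4, 10)}


def prob(event):
    """P(event) as the sum of point weights."""
    return sum(WEIGHTS[w] for w in event)


def F(x, y):
    """The function F(x, y) = x / (1 + x - y) from the text."""
    return x / (1 + x - y)


def mu(a, b):
    """Uncertainty of (a|b) computed from the endpoints ab and b' v a."""
    lower = a & b                 # ab
    upper = (OMEGA - b) | a       # b' v a
    return F(prob(lower), prob(upper))


def conditional(a, b):
    """Ordinary conditional probability P(ab) / P(b), for P(b) > 0."""
    return prob(a & b) / prob(b)
```

The identity behind the agreement is 1 + P(ab) - P(b' ∨ a) = P(b), which holds because b' and ab are disjoint.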
Specifically, we will proceed to justify conditional probability as a means to assign uncertainty to conditional events. Note that, in the development of the theory of measure-free conditioning (Chapter 2), the condition of compatibility with probability was used in an essential way, so that it is possible to assign conditional probabilities to conditional events in a consistent manner. We emphasize, however, the appearance of conditional events as cosets of Boolean rings in our present work. The problem has been: define mathematically objects $(a|b)$ representing implicative propositions of the form "if b, then a" (implicit or explicit) or "a on condition b" or "a given b" in such a way that it is possible to quantify the strengths of these propositions by conditional probabilities. Of course, if $(a|b)$ is modeled as material implication $b \to a = b' \vee a$, then one can quantify it by unconditional probability $P(b \to a)$.

The general problem in reasoning under uncertainty in artificial intelligence is this. Given a knowledge base consisting of uncertain conditional information, how does one combine these conditional propositions and do inference? At the syntax level, one first needs to define or model "if b, then a" by $b \Rightarrow a$, say. Next, define appropriate connectives among such objects so that one can combine $b \Rightarrow a$ with $d \Rightarrow c$ through the use of these connectives. For example, $(b \Rightarrow a) \wedge (d \Rightarrow c)$. At the numerical quantification level, one chooses an uncertainty measure $\mu$ which can operate on the $(b \Rightarrow a)$ and proceeds to compute, for example, $\mu((b \Rightarrow a) \wedge (d \Rightarrow c))$. When $\mu(b \Rightarrow a)$ is chosen to be $P(a|b)$, then $b \Rightarrow a$ has to be a coset. The logical operations among cosets developed in Chapter 3 provide connectives for conditional propositions. One combines several conditional propositions at the syntax level, obtaining another coset, and then evaluates its conditional probability, which is considered as the measure of uncertainty of the combined evidence. Furthermore, it will be shown in Chapter 6 that an entailment relation among conditional propositions can be established so that deduction or inference can be carried out at the numerical level.

If $b \Rightarrow a$ is modeled differently, for example, as in a "first-order conditional logic" of Delgrande (1987), then the quantification measure $\mu$ should be different from a conditional probability operator. As an example, one can model $b \Rightarrow a$ as $b \to a$ (material implication) and use some appropriate non-additive "measure" $\mu$ on the ring R so that $\mu(b \to a) = \mu(a|b)$, where $\mu(\cdot|b)$ is defined to be a "conditional measure". A typical situation is when $\mu$ is chosen to be a Dempster-Shafer belief function (see Sombe, 1990, p. 405-406, or Pearl, 1990, p. 371-373). For example, let R be the power set of a finite set D. Let $m : R \to [0, 1]$ be such that

$$m(\emptyset) = 0, \qquad \sum_{a \le D} m(a) = 1,$$

and

$$\mu : R \to [0, 1] \quad \text{with} \quad \mu(b) = \sum_{a \le b} m(a).$$

For fixed b, define

$$m_b(a) = \sum m(x),$$

where the summation is over all x in R such that $bx = a$. Then the conditional belief function $\mu(\cdot|b)$ is defined by

$$\mu(a|b) = \sum_{x \le a} m_b(x).$$

It is easy to verify that

$$\{x : x \le b' \vee a\} = \{x : xb = y \le a\}.$$

Thus, $\mu(a|b) = \mu(b \to a)$.


It is relevant here to describe the works of Rényi (1970) and Cox (1961). First, let $(\Omega, \mathcal{A})$ be a measurable space. If P is a probability measure on $\mathcal{A}$, then the associated conditional probability "operator" $\hat{P}$ is defined as follows. Let

$$\mathcal{A}_P = \{a : a \in \mathcal{A},\ P(a) > 0\}.$$

Then define

$$\hat{P} : \mathcal{A} \times \mathcal{A}_P \to [0, 1]$$

by

$$\hat{P}(a, b) = P(a|b) = P(ab)/P(b).$$

Here, $\hat{P}$ is viewed as a "global" map, that is, with domain $\mathcal{A} \times \mathcal{A}_P$, rather than "locally," that is, rather than a collection of maps $P(\cdot|b)$, one for each $b \in \mathcal{A}_P$. This is in line with Rényi's concept of conditional probability spaces (Rényi, 1970). See later for details.

The map $\hat{P}$ has the following basic properties:

(i) For each $b \in \mathcal{A}_P$, $\hat{P}(\cdot, b) : \mathcal{A} \to [0, 1]$ is a non-negative and $\sigma$-additive set function (that is, a measure).

(ii) For every $b \in \mathcal{A}_P$, $\hat{P}(b, b) = 1$.

(iii) For $b, c \in \mathcal{A}_P$ with $b \le c$, one has $\hat{P}(b, c) > 0$, and if $a \in \mathcal{A}$, then

$$\hat{P}(a, b) = \hat{P}(ab, c)/\hat{P}(b, c).$$

The subset $\mathcal{A}_P$ of $\mathcal{A}$ has the following basic properties:




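(iv) If $b_1, b_2 \in \mathcal{A}_P$, then $b_1 \vee b_2 \in \mathcal{A}_P$.

(v) There exists a sequence $b_n \in \mathcal{A}_P$, $n \ge 1$, such that $\bigvee_{n=1}^{+\infty} b_n = \Omega$.

(vi) $\emptyset \notin \mathcal{A}_P$.

Following Rényi, a subset $\mathcal{B} \subseteq \mathcal{A}$, satisfying (iv), (v) and (vi), is called a bunch.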

The abstraction of the above is clear: an abstract conditional probability operator (or conditional probability operator, or CPO for short) on $(\Omega, \mathcal{A}, \mathcal{B})$, where $\mathcal{B} \subseteq \mathcal{A}$ is a bunch, is a map $\hat{P}$ defined on $\mathcal{A} \times \mathcal{B}$, satisfying (i), (ii) and (iii).

Note that, from (iii) and (ii), with b = c, we get

$$\hat{P}(a, b) = \hat{P}(ab, b).$$

By (i), $\hat{P}(\cdot, b)$ is non-decreasing, so that

$$\hat{P}(a, b) = \hat{P}(ab, b) \le \hat{P}(b, b) = 1.$$

Also, $\hat{P}(\emptyset, b) = 0$, since $\hat{P}(\cdot, b)$ is a measure by (i). Thus, the range of $\hat{P}$ is [0, 1].
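Properties (ii) and (iii), together with the consequence that conditioning on b only sees the part ab, can be verified exhaustively for the operator induced by a probability on a finite space. The three-point space and its weights are hypothetical, and the function names are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({1, 2, 3})
# Hypothetical point probabilities, chosen only for illustration.
P = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}


def subsets(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def prob(a):
    return sum(P[w] for w in a)


def cpo(a, b):
    """The operator (a, b) -> P(ab) / P(b), defined only when P(b) > 0."""
    pb = prob(b)
    if pb == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(a & b) / pb


# The conditioning events: all sets of positive probability.
BUNCH = [b for b in subsets(OMEGA) if prob(b) > 0]


def satisfies_cpo_properties():
    """Check (ii), (iii) and the derived identity on every admissible triple."""
    for b in BUNCH:
        if cpo(b, b) != 1:                                   # property (ii)
            return False
        for a in subsets(OMEGA):
            if cpo(a, b) != cpo(a & b, b):                   # derived identity
                return False
            for c in BUNCH:
                if b <= c and cpo(a, b) != cpo(a & b, c) / cpo(b, c):
                    return False                             # property (iii)
    return True
```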

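The main result of Rényi (Rényi, 1970, p. 40) is this. If $\hat{P}$ is a CPO on $(\Omega, \mathcal{A}, \mathcal{B})$, then there exists a $\sigma$-finite measure $\mu$ on $\mathcal{A}$, unique up to a positive constant factor, such that:

$$\mathcal{B} \subseteq \{a : a \in \mathcal{A},\ 0 < \mu(a) < +\infty\},$$

and for all $a \in \mathcal{A}$ and $b \in \mathcal{B}$,

$$\hat{P}(a, b) = \mu(ab)/\mu(b).$$

For $\mu$ to characterize $\hat{P}$, we need to extend $\hat{P}$ so that

$$\mathcal{B} = \{a : a \in \mathcal{A},\ 0 < \mu(a) < +\infty\}.$$

This can be done as follows (Rényi, 1970, p. 43). Clearly,

$$\mathcal{B}^* = \{a : a \in \mathcal{A},\ 0 < \mu(a) < +\infty\}$$

is a bunch. Note that $\mathcal{B}^*$ is the same for all measures $\mu$ in Rényi's theorem above. We have $\mathcal{B} \subseteq \mathcal{B}^*$. If $\mathcal{B} \ne \mathcal{B}^*$, we extend $\hat{P}$ to $\mathcal{A} \times \mathcal{B}^*$ by

$$\hat{P}(a, b) = \mu(ab)/\mu(b),$$

for $b \in \mathcal{B}^* - \mathcal{B}$ and $a \in \mathcal{A}$. This extended operator $\hat{P}$ is a CPO on $(\Omega, \mathcal{A}, \mathcal{B}^*)$. Therefore, there is no loss of generality in assuming that any CPO $\hat{P}$ on $(\Omega, \mathcal{A}, \mathcal{B})$ is characterized by $\mu$ with $\mathcal{B} = \{a : a \in \mathcal{A},\ 0 < \mu(a) < +\infty\}$. Thus, each CPO $\hat{P}$ on $(\Omega, \mathcal{A}, \mathcal{B})$ is derived from some ($\sigma$-finite) measure $\mu$ on $(\Omega, \mathcal{A})$. In particular, if $\mu$ is finite, that is, $\mu(\Omega) < +\infty$, then $\Omega \in \mathcal{B}$, so that $\hat{P}(\cdot, \Omega)$ is an ordinary probability measure on $(\Omega, \mathcal{A})$, where for $a \in \mathcal{A}$, $\hat{P}(a, \Omega)$ is interpreted as the unconditional probability of the event a.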

As a final note on Rényi's work, recall that Rényi's concept of conditional probability spaces $(\Omega, \mathcal{A}, \mathcal{B}, \hat{P})$ was motivated by the thesis mentioned at the beginning of this section that "every probability is in reality a conditional probability." Thus, it is intuitive to define CPO first, and then derive ordinary probability measures as special cases. Conditional probability spaces are consistent with Kolmogorov's model of probability spaces in the sense that they generalize Kolmogorov's probability spaces. Note, however, that Kolmogorov defined probability measures first and then derived conditional probability measures.

Next,  we  outline  Cox's  work  (Cox,  1961)  concerning  a  class  of  uncertainty  measures 
which  can  be  transformed  into  conditional  probability  measures.  In  passing,  we  will 
mention  the  analogy  with  Lindley's  message  on  the  inevitability  of  probability  (Lindley, 
1982). 

Let R be a Boolean ring of propositions. Taking the same thesis that numerical uncertainty is conditional in nature, Cox proceeded to derive a calculus of uncertainty as follows.

Let $\mu$ be a map on an appropriate domain in R x R. Cox (1961, pp. 18-22) proved that if

(1) $\mu(a, b) = f(\mu(a', b))$ with f differentiable, and

(2) $\mu(ab, c) = \mu(a, c)\,\mu(b, ac)$,

then $\mu(\cdot, b)$ is finitely additive and $f(x) = 1 - x$.

More generally, Cox replaced (2) by

(3) $\mu(ab, c) = F(\mu(a, c),\ \mu(b, ac))$ with $F(x, y)$ differentiable.

Then he showed that there exist functions g of one variable such that (1) and (2) are satisfied when $\mu$ is replaced by $g \circ \mu$. As a consequence, $g \circ \mu(\cdot, b)$ is finitely additive. In other words, the uncertainty measures $\mu$ satisfying (1) and (3) can be transformed into (conditional) probabilities. Cox argued that there is no difference between $\mu$ and $g \circ \mu$ since "if $\mu(a|b)$ measures probability, so also does an arbitrary function of $\mu(a|b)$" (Cox, 1961, p. 16). This is precisely what we should understand years later when Lindley declared that "one cannot avoid probability" (Lindley, 1982).

Now, suppose P is a probability measure on $(\Omega, \mathcal{A})$, and let $g(x) = x^r$ for some $r > 1$. Then $g \circ P = P^r$ is no longer a probability measure. In fact, $P^r$ is a belief function in the sense of Dempster-Shafer (Shafer, 1976). See also Section 5.3. Of course, $P^r$ can be computed from P, but as a set function on $\mathcal{A}$, $P^r$ satisfies a weaker set of axioms than that of a probability measure. In an abstract setting, that is, when belief functions are defined from axioms, not all belief functions are functions of probability measures (see Goodman, Nguyen and Rogers, 1990). In Lindley's sense, belief functions which cannot be transformed into probability measures are "inadmissible." More generally, if $\mu : \mathcal{A} \to \mathbb{R}^+$, say, is a set-function representing a quantification of uncertainty, then $\mu$ is "admissible" if there is some function g such that $g \circ \mu$ is a (finitely) additive set-function. It is clear that $\mu$ need not be a probability measure. Thus, in the view of Lindley, whenever an uncertainty measure $\mu$ is considered, one should find some function g such that $g \circ \mu$ is a probability, and then inferences should be based upon $g \circ \mu$ and not upon $\mu$. As we have seen, a sufficient condition for the existence of such g is the set of conditions (1) and (3) in Cox's program. Note that the work of Lindley is "conditional" in nature. Any $\mu$ which cannot be transformed into probabilities should be ruled out! Because of this important view on decision making in uncertain systems, we present below an outline of Lindley's paper. For more details, see Goodman, Nguyen and Rogers (1990).
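A short numerical illustration of the transformation idea: with r = 2, the set function P^r fails additivity on an event and its complement, while composing with g(x) = x^(1/r) restores it. The exponent and the probability values are hypothetical, and floating-point comparison is done with a tolerance.

```python
import math


def mu(p, r=2.0):
    """Uncertainty value P(a)^r attached to an event of probability p."""
    return p ** r


def g(x, r=2.0):
    """Transformation x -> x^(1/r) that maps mu back to a probability."""
    return x ** (1.0 / r)


def additivity_defect(p, transform=None, r=2.0):
    """Return value(a) + value(a') - 1 for an event a with P(a) = p.

    With transform=None the raw values mu are used; otherwise the
    transformed values transform(mu(.)) are used.
    """
    values = [mu(p, r), mu(1.0 - p, r)]
    if transform is not None:
        values = [transform(v, r) for v in values]
    return sum(values) - 1.0
```

For p = 0.3 the raw defect is 0.09 + 0.49 - 1 = -0.42, whereas the transformed defect vanishes up to rounding.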

Let R be a Boolean ring, viewed as a field of subsets of some set $\Omega$. Roughly speaking, an uncertainty measure $\mu : (R|R) \to \mathbb{R}$ is said to be "admissible" if there is a function g such that $g \circ \mu$ is finitely additive. To make this statement precise, we need to explain the concept of admissibility and the sense in which $g \circ \mu$ is finitely additive. The most general framework in which admissibility can be addressed is game theory.

Consider the following special class of games called uncertainty games. These are triples $(A_1, A_2, L)$ of the following form. $A_1$ is regarded as a space

$$A_1 = \{((a_1|b_1), \ldots, (a_n|b_n),\ \omega) : a_i, b_i \in R,\ i = 1, \ldots, n;\ \omega \in \Omega,\ n \ge 1\}$$

of all possible "moves" or "pure strategies" of player I. Fix, once and for all, two real numbers $\alpha_0 < \alpha_1$, and let

$$A_2 = \{\mu : (R|R) \to [\alpha_0, \alpha_1]\}.$$

Each element of $A_2$ is a map assigning a number (describing the uncertainty) to each conditional event. $A_2$ is regarded as the space of "moves" of player II. Consider now the choice of loss function L. As in Lindley's paper, a function

$$f : [\alpha_0, \alpha_1] \times \{0, u, 1\} \to \mathbb{R}$$

is called a score function if

(i) for each $j \in \{0, 1\}$, $f(\cdot, j) : [\alpha_0, \alpha_1] \to \mathbb{R}$ is continuously differentiable, with a unique global minimum in $[\alpha_0, \alpha_1]$ at $\alpha_j$, and

(ii) $f(x, u) = 0$ for all $x \in [\alpha_0, \alpha_1]$.

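We extend f to $[\alpha_0, \alpha_1]^n \times \{0, u, 1\}^n$, $n \ge 1$, as usual. For

$$x_n = (x_1, \ldots, x_n) \in [\alpha_0, \alpha_1]^n, \qquad t_n = (t_1, \ldots, t_n) \in \{0, u, 1\}^n,$$

$$f(x_n, t_n) = (f(x_1, t_1), \ldots, f(x_n, t_n)) \in \{(y_1, \ldots, y_n) \in \mathbb{R}^n,\ n \ge 1\}.$$

Similarly, $\mu$ is extended to $(R|R)^n$ componentwise. For

$$(a|b)_n = ((a_1|b_1), \ldots, (a_n|b_n)) \in (R|R)^n,$$

$$\mu(a|b)_n = (\mu(a_1|b_1), \ldots, \mu(a_n|b_n)) \in [\alpha_0, \alpha_1]^n.$$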

Let $\varphi(a|b)$ denote the generalized indicator function of $(a|b)$. A natural way to combine individual "scores"

$$f(\mu(a_i|b_i),\ \varphi(a_i|b_i)(\omega)), \quad i = 1, 2, \ldots, n,$$

to obtain the total score is using addition on $\mathbb{R}$. That is, take

$$L_{f,+}((a|b)_n,\ \omega,\ \mu) = \sum_{i=1}^{n} f(\mu(a_i|b_i),\ \varphi(a_i|b_i)(\omega)).$$

The loss function $L_{f,+}$ depends on two functions, the score function f and the additive aggregation function +.

In general, by an aggregation function, we mean a function

$$\psi : \{(y_1, \ldots, y_n) \in \mathbb{R}^n,\ n \ge 1\} \to \mathbb{R}$$

such that

a) $\psi$ is continuously differentiable in all of its arguments,

b) $\psi$ is increasing in each of its arguments, and

c) $\psi(0_n) = 0$, $\forall n \ge 1$, where $0_n$ denotes the zero vector in $\mathbb{R}^n$.

The additive aggregation function is generated by ordinary addition on $\mathbb{R}$. Taking $\psi = +$ is equivalent to the sequence of functions $+_n$, $n \ge 1$, where

$$+_n : \mathbb{R}^n \to \mathbb{R}, \qquad +_n(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i.$$

Similarly, an aggregation function $\psi$ can be identified with the sequence $\psi_n$, $n \ge 1$, where $\psi_n$ is the restriction of $\psi$ to $\mathbb{R}^n$. While the additive aggregation function is symmetric, there is no a priori reason to impose such a condition on arbitrary aggregation functions.

The game $(A_1, A_2, L_{f,+})$ will be denoted by $G_{f,+}$. It is simpler to formulate the concept of admissibility of uncertainty measures using an equivalent reduced form of $G_{f,+}$. In the expression of $L_{f,+}$, the value of $L_{f,+}((a|b)_n,\ \omega,\ \mu)$, for each fixed $\mu$, at $((a|b)_n,\ \omega) \in A_1$, depends on the "configuration"

$$\varphi(a|b)_n(\omega) = (\varphi(a_1|b_1)(\omega), \ldots, \varphi(a_n|b_n)(\omega)) \in \{0, u, 1\}^n.$$

Thus, $L_{f,+}((a|b)_n,\ \cdot,\ \mu)$ is constant on each element of the canonical partition $\pi(a|b)_n$ of $\Omega$ generated by $(a|b)_n$. Specifically,

$$\pi(a|b)_n = \{B_j,\ j = 1, 2, \ldots, 2^{3n}\},$$

where each $B_j$ is of the form $\bigwedge_{k=1}^{3n} D_k^{\varepsilon_k}$, for

$$D_k \in \{a_i b_i,\ a_i' b_i,\ b_i',\ i = 1, \ldots, n\},$$

$$\varepsilon_k = 1 \text{ or } 0, \qquad a^1 = a, \qquad a^0 = a'.$$

(See Rényi, 1970, p. 12-15.) Thus, we can replace $A_1$ by

$$A_1^* = \{((a|b)_n,\ B) : (a|b)_n \in (R|R)^n,\ B \in \pi(a|b)_n,\ n \ge 1\}.$$

$L_{f,+}$ is modified to

$$L^*_{f,+} : A_1^* \times A_2 \to \mathbb{R},$$

$$L^*_{f,+}((a|b)_n,\ B,\ \mu) = \sum_{i=1}^{n} f(\mu(a_i|b_i),\ \varphi(a_i|b_i)(B)),$$

where $\varphi(a_i|b_i)(B) = \varphi(a_i|b_i)(\omega)$ for $\omega \in B$. The equivalent reduced form of $G_{f,+}$ is

$$G^*_{f,+} = (A_1^*,\ A_2,\ L^*_{f,+}).$$

The development above, as well as various concepts of admissibility with respect to $G^*_{f,+}$ which will be formulated below, are extended in a straightforward manner to an arbitrary aggregation function $\psi$, replacing + by $\psi$.

First, $\mu \in A_2$ is (ordinary) admissible with respect to $G^*_{f,+}$ if there is no $\nu \in A_2$ such that

$$L^*_{f,+}((a|b)_n,\ B,\ \nu) \le L^*_{f,+}((a|b)_n,\ B,\ \mu)$$

for all $((a|b)_n,\ B) \in A_1^*$, with strict inequality holding for at least one $((a|b)_n,\ B)$.

For each fixed $(a|b)_n$, a subgame of $G^*_{f,+}$ is $G^*_{f,+}(a|b)_n$, where $A_2$ is replaced by

$$A_2(a|b)_n = \{\mu : \{(a_i|b_i) : i = 1, \ldots, n\} \to \mathbb{R}\}.$$

With respect to $G^*_{f,+}(a|b)_n$, $\mu \in A_2$ is admissible if there is no $\nu \in A_2(a|b)_n$ such that

$$L^*_{f,+}((a|b)_n,\ B,\ \nu) \le L^*_{f,+}((a|b)_n,\ B,\ \mu)$$

for all $B \in \pi(a|b)_n$, with strict inequality holding for at least one B. $\mu \in A_2$ is said to be uniformly admissible with respect to $G^*_{f,+}$ if it is admissible with respect to $G^*_{f,+}(a|b)_n$ for all $(a|b)_n \in (R|R)^n$. It is clear that uniform admissibility implies ordinary admissibility. It turns out that under mild conditions, uniform admissibility of $\mu$ with respect to $G^*_{f,+}$ is equivalent to the existence of a function g such that the restriction of $g \circ \mu$ to R is a finitely additive probability measure. (R is considered as a subset of R|R, by identifying $(a|\Omega)$ with a.)

As in Lindley (1982), let $f'(x, j)$, $j = 0, 1$, denote the derivative of $f(x, j)$ with respect to x; the above function g is

$$P_f(x) = \frac{f'(x, 0)}{f'(x, 0) - f'(x, 1)}, \qquad x \in [\alpha_0, \alpha_1].$$

The following result was proved in Goodman, Nguyen and Rogers (1990): With respect to the game $G^*_{f,+}$ with score function f such that $P_f$ is increasing, $\mu$ is uniformly admissible if and only if the restriction of $P_f \circ \mu$ to R is a finitely additive probability measure. But if f is not a proper score function, that is, if $P_f(x) \ne x$ for some x, then it can be shown that no non-atomic probability $\mu$ on R can be $G^*_{f,+}$-uniformly admissible, so that we have to consider the concept of admissibility in a wide sense.

Specifically, $\mu$ is said to be generally admissible if there is a game $G^*_{f,+}$ such that $\mu$ is uniformly admissible with respect to that game. In this sense, any probability measure is "admissible" by taking the score function f to be a proper score function, that is, by taking f such that $P_f(x) = x$, for all x.
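To make the distinction between proper and improper score functions concrete, the sketch below evaluates the transform P_f(x) = f'(x, 0) / (f'(x, 0) - f'(x, 1)) for two scores on [0, 1]. Both scores are examples supplied here as assumptions: the quadratic score f(x, 0) = x^2, f(x, 1) = (1 - x)^2, and a cubic variant f(x, 0) = x^3, f(x, 1) = (1 - x)^3. The derivatives are entered analytically.

```python
from fractions import Fraction


def p_f(x, d0, d1):
    """Transform P_f(x) built from the derivatives d0 = f'(., 0), d1 = f'(., 1)."""
    return d0(x) / (d0(x) - d1(x))


def quad_d0(x):
    """Derivative of the quadratic score x^2 (event false)."""
    return 2 * x


def quad_d1(x):
    """Derivative of the quadratic score (1 - x)^2 (event true)."""
    return -2 * (1 - x)


def cubic_d0(x):
    """Derivative of the cubic score x^3 (event false)."""
    return 3 * x ** 2


def cubic_d1(x):
    """Derivative of the cubic score (1 - x)^3 (event true)."""
    return -3 * (1 - x) ** 2
```

Under these assumptions the quadratic score gives P_f(x) = x, so it is proper, while the cubic score gives P_f(x) = x^2 / (x^2 + (1 - x)^2), which differs from x except at 0, 1/2 and 1.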

Consider Dempster-Shafer belief functions (Shafer, 1976). For simplicity, consider the case where $\Omega$ is a finite set (see Section 5.3 for the general case). A belief function Bel on the power set of $\Omega$, denoted as $\mathcal{P}(\Omega)$, can be defined as follows.

Let $m : \mathcal{P}(\Omega) \to [0, 1]$ be such that

$$m(\emptyset) = 0, \qquad \sum_{a \in \mathcal{P}(\Omega)} m(a) = 1;$$

$$\mathrm{Bel}(b) = \sum_{a \le b} m(a).$$

Note that if $\mu : \mathcal{P}(\Omega) \to [0, 1]$ is such that $\mu(\Omega) = 1$ and for all $a \le \Omega$,

$$\sum_{b \le a} (-1)^{|a - b|} \mu(b) \ge 0,$$

then $\mu$ is a belief function. ($|b|$ denotes the cardinality of the set b.)

If we think of "sets" as "points", then m plays the role of a probability mass function, and Bel is the "cumulative distribution function" of some random set. See Section 5.3. Since $\Omega$ is finite, we have

$$\mathrm{Bel}(a) = P(X \in \mathcal{P}(a)), \qquad a \le \Omega,$$

where X is a random set, defined on some probability space $(E, \mathcal{E}, P)$, and taking values in $\mathcal{P}(\Omega)$ with "density" m, that is,

$$P(X = a) = m(a).$$

Note that $\mathrm{Bel}(a) + \mathrm{Bel}(a') \le 1$.

We extend Bel from $\mathcal{P}(\Omega)$ to the conditional space $\mathcal{P}(\Omega) | \mathcal{P}(\Omega)$ as follows.

For $a, b \in \mathcal{P}(\Omega)$, such that $P(X \le b) > 0$, define $\mathrm{Bel}(a|b) = P(X \le a \mid X \le b)$. By the nature of belief functions, we take $[\alpha_0, \alpha_1] = [0, 1]$.

It is easy to construct Bel such that there exist $a, b \in \mathcal{P}(\Omega)$ with $\mathrm{Bel}(a) = \mathrm{Bel}(b)$ but $\mathrm{Bel}(a') \ne \mathrm{Bel}(b')$. It can be shown that such a belief function is inadmissible. This is due essentially to the fact that, for such belief functions, $\mathrm{Bel}(a'|b)$ is not a function of $\mathrm{Bel}(a|b)$.

Now observe that

$$\mathrm{Bel}(ac|b) = P(X \le ac \mid X \le b)$$
$$= P(X \le a,\ X \le c \mid X \le b)$$
$$= P(X \le a \mid X \le b)\,P(X \le c \mid X \le a,\ X \le b)$$
$$= \mathrm{Bel}(a|b)\,\mathrm{Bel}(c|ab).$$

Thus, if there is a differentiable function $h : [0, 1] \to [0, 1]$ such that for all $a, b \le \Omega$ with $\mathrm{Bel}(b) > 0$, we have

$$\mathrm{Bel}(a'|b) = h(\mathrm{Bel}(a|b)),$$

then by Cox's result, Bel is admissible. For example, if $\mathrm{Bel} = P^r$ where $r > 1$, and P is a probability measure, then Bel is admissible.
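The product rule for conditional beliefs and the inequality Bel(a) + Bel(a') ≤ 1 can be checked exactly on a small example. The mass function below is hypothetical; the conditional belief is computed as Bel(ab)/Bel(b), which equals P(X ≤ a | X ≤ b) because the events X ≤ a and X ≤ b together amount to X ≤ ab.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({1, 2, 3})
# Hypothetical "density" m of the random set X.
MASS = {
    frozenset({1}): Fraction(1, 4),
    frozenset({1, 2}): Fraction(1, 4),
    frozenset({2, 3}): Fraction(1, 4),
    OMEGA: Fraction(1, 4),
}


def subsets(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def bel(a):
    """Bel(a) = P(X <= a): total mass of the focal sets inside a."""
    return sum(v for x, v in MASS.items() if x <= a)


def cond_bel(a, b):
    """Bel(a|b) = P(X <= a | X <= b) = Bel(ab) / Bel(b), for Bel(b) > 0."""
    denom = bel(b)
    if denom == 0:
        raise ValueError("Bel(b) must be positive")
    return bel(a & b) / denom


def product_rule_holds():
    """Check Bel(ac|b) == Bel(a|b) * Bel(c|ab) wherever both sides exist."""
    for a in subsets(OMEGA):
        for b in subsets(OMEGA):
            if bel(a & b) == 0:
                continue
            for c in subsets(OMEGA):
                if cond_bel(a & c, b) != cond_bel(a, b) * cond_bel(c, a & b):
                    return False
    return True


def subadditive_on_complements():
    """Check Bel(a) + Bel(a') <= 1 for every subset a."""
    return all(bel(a) + bel(OMEGA - a) <= 1 for a in subsets(OMEGA))
```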

As another example of non-additive uncertainty measures which are admissible, we turn to fuzzy logics. For background, see Chapter 7. A t-conorm T is said to be archimedean if T is continuous and $\forall x \in (0, 1)$, $x < T(x, x)$. (See, for example, Schweizer and Sklar, 1983.) $T(x, y) = \min(x + y,\ 1)$ is archimedean, while $T(x, y) = \max(x, y)$ is not. T is an archimedean t-conorm if and only if there exists an increasing, continuous function g (called the additive generator or generator of T) which maps $[0, 1] \to [0, +\infty]$ with $g(0) = 0$ and such that for $x, y \in [0, 1]$,

$$T(x, y) = g^*(g(x) + g(y)).$$

The pseudo-inverse $g^*$ of g is a function $g^* : [0, +\infty] \to [0, 1]$ defined by

$$g^*(x) = \begin{cases} g^{-1}(x) & \text{if } x \in [0, g(1)] \\ 1 & \text{if } x > g(1) \end{cases}$$

(See Ling, 1965.)

For example, for $p \ge 1$, $T_p(x, y) = [\min(x^p + y^p,\ 1)]^{1/p}$ has generator $g_p(x) = x^p$ and

$$g_p^*(x) = \begin{cases} x^{1/p} & \text{if } x \in [0, 1] \\ 1 & \text{if } x > 1 \end{cases}$$

(noting that $g_p(1) = 1$ here).

Since each t-conorm T is associative and commutative, we can extend T to any n-tuples, $n \ge 1$, as follows. $T(x_1) = x_1$, by convention, and for $n \ge 2$,

$$T(x_1, x_2, \ldots, x_n) = T(x_1,\ T(x_2, x_3, \ldots, x_n)).$$

The representation of an archimedean t-conorm T becomes

$$T(x_1, x_2, \ldots, x_n) = g^*\Big(\sum_{i=1}^{n} g(x_i)\Big), \qquad n \ge 1.$$

For a t-conorm T, a T-possibility measure is a map $\mu$ from $\mathcal{P}(\Omega)$ to [0, 1] such that for $a, b \le \Omega$ with $ab = \emptyset$, $\mu(a \vee b) = T(\mu(a), \mu(b))$. Zadeh's possibility measure corresponds to $T(x, y) = \max(x, y)$. The following result is from Goodman, Nguyen and Rogers (1990).

Let $\Omega$ be finite and $\mu : \mathcal{P}(\Omega) \to [0, 1]$. Then $\mu$ is admissible if and only if $\mu$ is a T-possibility measure with T being an archimedean t-conorm with generator g such that

$$g(1) = 1 \quad \text{and} \quad \sum_{\omega \in \Omega} g(\mu(\{\omega\})) \le 1.$$

A T-possibility measure with $T(x, y) = \max(x, y)$ is not admissible, but it can be approximated by admissible ones. Indeed, if $\mu$ is such that $\sum_{\Omega} \mu(\omega) \le 1$, then for $p \ge 1$, $\nu_p(a) = T_p(\mu(\omega),\ \omega \in a)$ is admissible since

$$\sum_{\Omega} g_p(\nu_p(\omega)) = \sum_{\Omega} [\mu(\omega)]^p \le 1.$$

On the other hand, for each fixed n,

$$T_p(x_1, x_2, \ldots, x_n) \to \max(x_1, x_2, \ldots, x_n)$$

as $p \to +\infty$, uniformly in $(x_1, x_2, \ldots, x_n)$.

Thus, if $\mu : \mathcal{P}(\Omega) \to [0, 1]$ is defined by

$$\mu(a) = \max_{\omega \in a} \mu(\omega),$$

for $a \le \Omega$, then $\mu$ is a T-possibility measure with $T(x, y) = \max(x, y)$, and

$$\mu(a) = \lim_{p \to +\infty} \nu_p(a),$$

uniformly in a.
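The family T_p, its generator g_p(x) = x^p, and the approach to max for large p can be explored numerically. The sketch below uses floating-point arithmetic; the sample points and the exponent 50 are arbitrary illustrative choices.

```python
import math


def g_p(x, p):
    """Additive generator g_p(x) = x^p."""
    return x ** p


def g_p_star(x, p):
    """Pseudo-inverse of g_p: x^(1/p) on [0, 1], and 1 beyond g_p(1) = 1."""
    return x ** (1.0 / p) if x <= 1.0 else 1.0


def t_p(xs, p):
    """n-ary T_p through the generator: g*(g(x_1) + ... + g(x_n))."""
    return g_p_star(sum(g_p(x, p) for x in xs), p)


def t_p_pair(x, y, p):
    """Binary T_p written directly as [min(x^p + y^p, 1)]^(1/p)."""
    return min(x ** p + y ** p, 1.0) ** (1.0 / p)
```

For instance, with p = 2 the pair (0.3, 0.4) gives 0.5, and with p = 50 the triple (0.3, 0.4, 0.2) is already within 1e-6 of its maximum 0.4.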

Finally, note that while the concept of admissibility of uncertainty measures with respect to a game is general, its equivalent form, namely that admissible uncertainty measures have to be transforms of finitely additive probability measures, is valid only in games with additive aggregation functions. Specifically, there are non-additive aggregation functions $\psi$ such that with respect to games $G_{f,\psi}$, admissible uncertainty measures need not be transformable to finitely additive probability measures. (For this analysis, see again Goodman, Nguyen and Rogers, 1990.)

We turn now to the justification of assigning conditional probabilities to conditional events. From the standard viewpoint of conditional probabilities, not via conditional events, the assignment of $P(a|b) = P(ab)/P(b)$ to the conditional event $(a|b)$ can be justified through a functional equation approach of Aczél (1966, p. 319-324). See also the discussions concerning Cox's and Rényi's works presented earlier in this section.

We present another justification based upon conditional event considerations (Goodman, 1991). Let P be a probability measure on a Boolean ring R. If $A \subseteq R$, then P(A) is the image of A under P, that is,

\[ P(A) = \{P(a) : a \in A\}. \]

In particular, for $(a|b) \in R|R$, we have $(a|b) \subseteq R$, so that, formally,

\[ P(a|b) = \{P(x) : x \in (a|b)\}. \]

But $(a|b) = [ab,\ b' \vee a]$, a closed interval in R (with the partial order relation $\le$ on R). Thus,

\[ P(a|b) = \{P(x) : ab \le x \le b' \vee a\} \subseteq [P(ab),\ P(b' \vee a)], \]

a closed interval in the unit interval [0, 1]. If $P(ab) = P(b' \vee a)$, then P(b) = 1 and

\[ P(a|b) = \{P(ab)\} = P(ab) = P(ab)/P(b). \]

If $P(b' \vee a) - P(ab) = 1$, then P(b) = 0, and the conditional probability P(a|b) is not defined. Thus, consider the case $[s, t] \subseteq [0, 1]$ with $s < t$ and $t - s \ne 1$. Let $h_1 : [0, 1] \to [0, 1]$ be given by

\[ h_1(\lambda) = \lambda t + (1 - \lambda)s, \]

and for $n \ge 2$,

\[ h_n(\lambda) = h_1(h_{n-1}(\lambda)). \]

Then, for $\lambda \in [0, 1]$, $\lim_{n\to\infty} h_n(\lambda)$ exists and is equal to

\[ \lambda_{s,t} = \frac{s}{1 + s - t}. \]

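The proof goes as follows. Since $h_1(\cdot)$ is non-decreasing and $\lambda_{s,t}$ is the (unique) fixed point of $h_1$, we have that $\lambda_{s,t} \le \lambda$ implies $\lambda_{s,t} \le h_n(\lambda)$, for $n \ge 1$. But if $\lambda_{s,t} \le h_n(\lambda)$, then $h_{n+1}(\lambda) \le h_n(\lambda)$. Thus, the sequence $(h_n(\lambda),\ n \ge 1)$ is decreasing and bounded from below by $\lambda_{s,t}$ (and from above by $\lambda$). Hence $\lim_{n\to\infty} h_n(\lambda) = h_\infty(\lambda)$ exists.

Similarly, if $\lambda < \lambda_{s,t}$, then $h_n(\lambda) \le \lambda_{s,t}$ for all $n \ge 1$, and hence $h_n(\lambda) \le h_{n+1}(\lambda)$. Thus,

\[ \lambda \le h_1(\lambda) \le h_2(\lambda) \le \cdots \le \lambda_{s,t}, \]

and hence $\lim_{n\to\infty} h_n(\lambda)$ exists.

In any case, for $\lambda \in [0, 1]$, we have

\[ h_\infty(\lambda) = \lim_{n\to\infty} h_n(\lambda) = h_1\bigl(\lim_{n\to\infty} h_{n-1}(\lambda)\bigr) = h_1(h_\infty(\lambda)). \]

Therefore, $h_\infty(\lambda) = \lambda_{s,t}$ for $\lambda \in [0, 1]$. □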

The above procedure of assigning the value $\lambda_{s,t}$ to the sub-interval $[s, t] \subseteq [0, 1]$ can be extended to an arbitrary subset A of [0, 1] by considering $[\inf(A), \sup(A)]$, that is, if $s = \inf(A)$ and $t = \sup(A)$, then one assigns to A the value $s/(1 + s - t)$.

Now, back to the case

\[ A = \{P(x) : ab \le x \le b' \vee a\} \subseteq [0, 1], \]

with

\[ \inf(A) = P(ab), \quad \sup(A) = P(b' \vee a). \]

It is natural to assign to the conditional event (a|b) the value

\[ \frac{P(ab)}{1 + P(ab) - P(b' \vee a)} \]

when $P(ab) < P(b' \vee a)$, which is the conditional probability P(ab)/P(b), since $P(b' \vee a) - P(ab) = P(b')$.
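As an illustration of the limiting argument above, the following minimal Python sketch iterates $h_1$ and compares the result with the fixed point $s/(1+s-t)$. The function names and the numbers P(ab) = 1/5, P(b) = 1/2 are assumptions chosen only for this example.

```python
from fractions import Fraction

def iterate_h(lam, s, t, n):
    """Apply h_1(x) = x*t + (1 - x)*s to lam, n times in a row."""
    x = lam
    for _ in range(n):
        x = x * t + (1 - x) * s
    return x

def fixed_point(s, t):
    """Unique fixed point s / (1 + s - t) of h_1 (requires t - s != 1)."""
    return s / (1 + s - t)

# Hypothetical values: P(ab) = 1/5 and P(b) = 1/2,
# so that P(b' v a) = P(b') + P(ab) = 7/10.
p_ab, p_b = Fraction(1, 5), Fraction(1, 2)
s, t = p_ab, (1 - p_b) + p_ab

limit = fixed_point(s, t)
# Iterates started from several initial points lambda in [0, 1].
approx = [float(iterate_h(Fraction(k, 4), s, t, 60)) for k in range(5)]
```

In this example the fixed point coincides with P(ab)/P(b), and the iterates approach it from every starting point, since each step shrinks the distance to the fixed point by the factor t - s.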
5.2 Conditional probability evaluations

Let P be a probability measure on a Boolean ring R. Unlike the traditional approach to conditional probability measures, namely, viewing a quantity like P(a|b) as a probability $P_b(\cdot)$ on R, for each fixed $b \in R$, we are going to extend P globally to the algebra of conditional events R|R, so that, if we denote this extension of P as $\hat{P}$, then

\[ \hat{P} : R|R \to [0, 1]. \]

Note that for fixed $b \in R$ with P(b) > 0, the probability measure $P_b(\cdot)$ on R defined by

\[ P_b(a) = P(ab)/P(b), \quad \forall a \in R, \]

is "equivalent" to the probability measure $\hat{P}_b(\cdot)$ on the Boolean (quotient) ring $R/Rb'$, where $\hat{P}_b(a|b) = P_b(a)$. Indeed, first, $\hat{P}_b(\cdot)$ is well-defined on $R/Rb'$; next, with respect to Boolean operations on $R/Rb'$ (that is, coset operations), $\hat{P}_b(\cdot)$ is a probability measure. Conversely, let $\hat{P}$ be a probability measure on $R/Rb'$. If we define $P_b(\cdot) : R \to [0, 1]$ by $P_b(a) = \hat{P}(a|b)$, then obviously $P_b(\cdot)$ is a probability measure. In $\hat{P}(a|b)$, (a|b) is an argument of the map $\hat{P}(\cdot)$. Note that, although the extended value $\hat{P}(a|b)$ is taken to be $\hat{P}(a|b) = P(ab)/P(b)$, for P(b) > 0, in the usual sense, care should be exercised with $\hat{P}(\cdot)$ as an extended map. In particular, with algebraic domain $(R|R,\ \cdot,\ \vee,\ (\cdot)')$, $\hat{P}$ is not a probability measure. As we have seen, R|R is not a Boolean ring. Moreover, $\hat{P}$ is not additive. The situation is somewhat different from the axiomatic setting for quantum probability theory (for example, Gudder, 1988) where the domain is a σ-additive class (generalizing the usual concept of a σ-field): there, a form of σ-additivity is reasonable to retain. This is possible not only because physical reality supports such a mathematical modeling, but also because quantum probability measures are not derived from classical probability measures the way $\hat{P}$ is derived from P.

Obviously, the advantage of viewing $\hat{P}$ as a global map on R|R is that, when the uncertainty is handled in a more quantitative way, one can combine conditional evidence with different antecedents. From a purely mathematical viewpoint, one can view R|R as an algebraic structure generalizing Boolean rings, say, a Stone algebra which does contain an underlying Boolean ring, and consider maps on R|R such that their restrictions to the underlying Boolean ring are probability measures. However, here we are simply content with examining properties of $\hat{P}$ for probabilistic inference purposes. First of all, extending the concept of disjointness of events, that is, elements of R, to R|R, we say that (a|b) and (c|d) are disjoint if

\[ (a|b) \vee (c|d) = (a|b) + (c|d). \]

In this case, we have

\[ \hat{P}((a|b) \vee (c|d)) = \hat{P}((a|b) + (c|d)) = \hat{P}((a + c \,|\, bd)) = \hat{P}(a|bd) + \hat{P}(c|bd). \]

Now

\[ (a|b) \vee (c|d) = (a|b) + (c|d) \]

implies

\[ ab \vee cd \vee bd = bd, \]

that is, $ab \vee cd \le bd$, so that $ab \le d$, $cd \le b$, hence abd = ab, bcd = cd. Thus

\[ \hat{P}(a|bd) + \hat{P}(c|bd) = \frac{\hat{P}(a|b)}{P(d|b)} + \frac{\hat{P}(c|d)}{P(b|d)} \ge \hat{P}(a|b) + \hat{P}(c|d). \]

Thus, $\hat{P}$ is not additive on R|R. However,


Theorem 1. $\hat{P}$ is monotone increasing on R|R.

Proof. Suppose $(a|b) \le (c|d)$. Then

\[ (c|d) = (a|b) \vee (c|d) = (ab|b) \vee (cd|d) = (ab \vee cd \,|\, ab \vee cd \vee bd). \]

Since $ab \vee cd \vee bd \le b \vee cd$, we have

\[ \hat{P}(c|d) = P(ab \vee cd)/P(ab \vee cd \vee bd) \ge P(ab \vee cd)/P(b \vee cd). \]

Now, since

\[ ab \vee cd = ab \vee (ab)'cd = ab + (ab)'cd \ge ab + b'cd \]

and

\[ b \vee cd = b + b'cd, \]

we have

\[ P(ab \vee cd) \ge P(ab) + P(b'cd), \quad P(b \vee cd) = P(b) + P(b'cd). \]

Putting these together yields

\[ \hat{P}(c|d) \ge \frac{P(ab) + P(b'cd)}{P(b) + P(b'cd)}. \]

It is easy to see that $\dfrac{P(ab) + t}{P(b) + t}$ is monotone increasing in $t \ge 0$, so that

\[ \hat{P}(c|d) \ge P(ab)/P(b) = \hat{P}(a|b). \qquad □ \]
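To make the non-additivity computation that precedes Theorem 1 concrete, here is a small Python sketch on a hypothetical four-point sample space with equally likely worlds. The space, the events and the helper names are assumptions made for illustration; only the formula P(ab)/P(b) for the extended value is taken from the text.

```python
from fractions import Fraction

# Hypothetical sample space: four equally likely possible worlds.
OMEGA = frozenset(range(4))

def prob(event):
    """Uniform probability of a set of worlds."""
    return Fraction(len(event), len(OMEGA))

def cond(a, b):
    """Extended value of (a|b): P(ab)/P(b), assuming P(b) > 0."""
    return prob(a & b) / prob(b)

# A pair satisfying ab v cd <= bd, the consequence of disjointness
# derived in the text.
a, b = frozenset({0}), frozenset({0, 1, 2})
c, d = frozenset({1}), frozenset({0, 1, 3})
bd = b & d
disjoint_condition = ((a & b) | (c & d)) <= bd

value_of_join = cond(a, bd) + cond(c, bd)     # P^(a|bd) + P^(c|bd)
sum_of_values = cond(a, b) + cond(c, d)       # P^(a|b) + P^(c|d)
# Same quantity written with the ratios P^(a|b)/P(d|b) + P^(c|d)/P(b|d).
via_ratios = cond(a, b) / cond(d, b) + cond(c, d) / cond(b, d)
```

Here the value of the disjunction is 1 while the separate values add up to 2/3, so the inequality in the text is strict for this choice of events.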


Remarks. 

(i) Since $ab \le (a|b) \le (b \to a)$, we have

\[ P(ab) \le \hat{P}(a|b) \le P(b \to a) = P(b' \vee a). \]

(ii)  It  is  easy  to  check  that 

ipc\b  V  d)  <  (a\b)-(c\d)  <  ( ac\bd ), 
and 

a  V  c\bd)  <,  (a\b)  V  (c|d)  <  (a  V  c\bd), 

so  that 

P(ac\b  V  d)<P((a\b)-(c\d))ZP(ac\bd), 
and 

P(a  V  c\bd)  < P((a\b)  V  (c|d))  ^ ?(a  V  c  Jfcd) . 


For combining conditional evidence, from a quantitative viewpoint, we present an extension of Fréchet's bounds to the conditional case. First, we recall the unconditional case. Let P be a probability measure on R. Then for any $a, b \in R$,

\[ P(ab) \le \min\{P(a), P(b)\}. \]

In fact, for $a \le b$,

\[ P(ab) = \min\{P(a), P(b)\}, \]

so that $\min\{P(a), P(b)\}$ is the best possible upper bound for P(ab). Similarly,

\[ P(a \vee b) \le \min\{1, P(a) + P(b)\}, \]

and equality is achieved when ab = 0. Now

\[ P(a' \vee b') \le \min\{1, P(a') + P(b')\} = \min\{1, 2 - P(a) - P(b)\}, \]

so that

\[ P(ab) = 1 - P(a' \vee b') \ge 1 - \min\{1, 2 - P(a) - P(b)\} = \max\{0, P(a) + P(b) - 1\}, \]

which is the best possible lower bound for P(ab). By the same token, the best possible lower bound for $P(a \vee b)$ is

\[ 1 - \min\{P(a'), P(b')\} = \max\{P(a), P(b)\}. \]


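More generally, for $n \ge 2$, we have

\[ \max\Bigl\{0,\ \sum_{i=1}^{n} P(a_i) - (n - 1)\Bigr\} \le P\Bigl(\bigwedge_{i=1}^{n} a_i\Bigr) \le \min\{P(a_i),\ i = 1, \ldots, n\}, \]

and

\[ \max\{P(a_i),\ i = 1, \ldots, n\} \le P\Bigl(\bigvee_{i=1}^{n} a_i\Bigr) \le \min\Bigl\{1,\ \sum_{i=1}^{n} P(a_i)\Bigr\}. \]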
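The unconditional Fréchet bounds above are easy to check numerically. The sketch below is a minimal illustration, not part of the original development: it computes the bounds from the marginals and tests them against randomly generated events on a small finite sample space. The function names, the number of worlds and the random seed are all assumptions.

```python
import random

def frechet_and(ps):
    """(lower, upper) bounds on P(a_1 ... a_n) given marginals ps."""
    return max(0.0, sum(ps) - (len(ps) - 1)), min(ps)

def frechet_or(ps):
    """(lower, upper) bounds on P(a_1 v ... v a_n) given marginals ps."""
    return max(ps), min(1.0, sum(ps))

def check(n_events=3, n_worlds=6, trials=200, seed=0):
    """Compare the bounds with exact values on random finite models."""
    rng = random.Random(seed)
    tol = 1e-12
    for _ in range(trials):
        weights = [rng.random() for _ in range(n_worlds)]
        total = sum(weights)

        def prob(event):
            return sum(weights[k] for k in event) / total

        events = [{k for k in range(n_worlds) if rng.random() < 0.5}
                  for _ in range(n_events)]
        ps = [prob(e) for e in events]
        lo, hi = frechet_and(ps)
        if not lo - tol <= prob(set.intersection(*events)) <= hi + tol:
            return False
        lo, hi = frechet_or(ps)
        if not lo - tol <= prob(set.union(*events)) <= hi + tol:
            return False
    return True

all_within_bounds = check()
```

The random check only confirms that the bounds are never violated; that they are attained follows from the special cases named in the text (nested events and disjoint events).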

Now, let $\varphi : R^n \to R$ be a Boolean function of n variables. Write $\varphi$ in its normal disjunctive form

\[ \varphi(a_1, a_2, \ldots, a_n) = \bigvee_{i_1=0,1} \cdots \bigvee_{i_n=0,1} \varphi(i_1, i_2, \ldots, i_n)\, a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n}, \]

with the usual convention $a^0 = a'$, $a^1 = a$. It is easy to see that one can determine two functions

\[ U_\varphi,\ L_\varphi : [0, 1]^n \to [0, 1] \]

such that

\[ L_\varphi(P(a_1), \ldots, P(a_n)) \le P[\varphi(a_1, \ldots, a_n)] \le U_\varphi(P(a_1), \ldots, P(a_n)), \]

where $L_\varphi = 1 - U_{\varphi'}$, with $\varphi'(a_1, \ldots, a_n) = [\varphi(a_1, \ldots, a_n)]'$.

These results were also obtained by Hailperin (1965, 1984) using the technique of linear programming. This latter technique provides a feasible procedure for computing the bounds $L_\varphi$ and $U_\varphi$ of $P[\varphi]$, and can be adapted to computational procedures in probability logic (Nilsson, 1986). To find $U_\varphi$, let $\alpha_i = P(a_i)$, $i = 1, 2, \ldots, n$. Then from the normal disjunctive form of $\varphi(a_1, \ldots, a_n)$, we have




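\[ P[\varphi(a_1, \ldots, a_n)] = \sum_{i_1=0}^{1} \cdots \sum_{i_n=0}^{1} \varphi(i_1, \ldots, i_n)\, \beta(i_1, \ldots, i_n), \]

where

\[ \beta(i_1, \ldots, i_n) = P(a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n}). \]

Next, for each $j = 1, \ldots, n$,

\[ a_j = \bigvee_{i_1=0,1} \cdots \bigvee_{i_{j-1}=0,1}\ \bigvee_{i_{j+1}=0,1} \cdots \bigvee_{i_n=0,1} a_1^{i_1} \cdots a_{j-1}^{i_{j-1}}\, a_j\, a_{j+1}^{i_{j+1}} \cdots a_n^{i_n}, \]

so that

\[ (*) \qquad \alpha_j = \sum_{i_1=0}^{1} \cdots \sum_{i_{j-1}=0}^{1}\ \sum_{i_{j+1}=0}^{1} \cdots \sum_{i_n=0}^{1} \beta(i_1, \ldots, i_{j-1}, 1, i_{j+1}, \ldots, i_n); \]

also, since the $a_1^{i_1} \cdots a_n^{i_n}$ form a partition of 1 (the greatest element of R), we have

\[ (**) \qquad \sum_{i_1=0}^{1} \cdots \sum_{i_n=0}^{1} \beta(i_1, \ldots, i_n) = 1. \]

Thus the least upper bound of $P[\varphi(a_1, \ldots, a_n)]$ is obtained by maximizing $P[\varphi(a_1, \ldots, a_n)]$, as a function of the variables $\beta(i_1, \ldots, i_n)$, subject to the constraints (*), (**) and $\beta(i_1, \ldots, i_n) \ge 0$ (the $\alpha_j$'s are constants). Note that since the $\varphi(i_1, \ldots, i_n)$'s are either 0 or 1 (elements of R), the linear constraints (*) and (**) can be put in a matrix form with a "design matrix" whose entries are 0's and 1's.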

The linear programming technique above for actually determining lower and upper bounds $L_\varphi$, $U_\varphi$ of $P[\varphi]$ for any Boolean expression $\varphi$ can be adapted to a similar situation in probability logic (Nilsson, 1986). Since Chapter 6 will deal with conditional probability logic, it is relevant here to say a few words about basic aspects of probability logic (see also Rescher, 1969 and Hailperin, 1984). We follow Nilsson (1986).

Although the collection of sentences of interest forms a Boolean ring R, and hence one can talk about probability measures P on it in an abstract setting, in practice only a small set of sentences is to be considered, for example, evidence in an expert system. The problem of probabilistic entailment is this. Suppose we have a set of sentences $s_i$, $i = 1, \ldots, n$, with known probabilities $P(s_i)$, $i = 1, \ldots, n$; compute P(r), for some sentence r of interest, in terms of the $P(s_i)$'s. First, sentences are taken to be "propositions," that is, each sentence is either true or false only. However, the uncertainty emerges since we do not know whether a given sentence is true or false, based on available information. By Stone's representation theorem, one can view each sentence s as a subset of a "sample space" or universe $\Omega$ on which probability measures are defined. In this setting, a "possible world" is simply an element of $\Omega$. Thus, for each $s \subseteq \Omega$, there are two sets of possible worlds s and s': in s, s is true; and in s', s is false. One can also consider the indicator function of s, $1_s : \Omega \to \{0, 1\}$; and as in statistical theory, before performing an experiment, it is meaningful to consider the chance that s will be "realized." In expert systems, for example, a sentence (evidence) s is to be considered, and it is desired to know its probability of being true. This is the common interpretation for probabilities of sentences. On the other hand, inference mechanisms in, say, expert systems, require some form of logical deduction to reach decisions. In the presence of uncertainty (about the truth or falsity of sentences), it is reasonable to invent a multi-valued logic in which the (probabilistic) truth value of a sentence s is taken to be its probability P(s) of being true. This logic is termed probability logic. Its base space remains a Boolean ring as in classical two-valued logic, while its truth evaluations range over the unit interval [0, 1]. It differs from the simplest form of fuzzy logic in its non-truth-functional calculus (derived from the axioms of probability measures) as well as in the interpretation of the meaning of degrees of belief.

Consider a finite collection of sentences, that is, n subsets $a_1, \ldots, a_n$ of $\Omega$ (or equivalently, n elements of a Boolean ring R). These sets generate a finite partition of $\Omega$, namely $\{a_1^{i_1} \cdots a_n^{i_n}\}$ where $i_j \in \{0, 1\}$, $j = 1, \ldots, n$ (with the usual convention $a^0 = a'$, $a^1 = a$, as before).

In logical terms, these sentences generate m ($\le 2^n$) sets of possible worlds. In each of these sets of possible worlds, one can specify the true/false values of any Boolean expression of the variables (that is, component sentences) using its normal disjunctive form as usual. For example, two sentences a, b generate four sets of possible worlds, namely ab, ab', a'b, a'b'. A possible world is a state of nature, or, in the "sample space" setting, an element $\omega \in \Omega$. However, unlike statistics, one cannot "perform an experiment" to get the "outcome" $\omega$. Consider three Boolean expressions $f_1(a, b) = a$, $f_2(a, b) = a \to b = a' \vee b$, $f_3(a, b) = b$. A "truth matrix" for these expressions is obtained by specifying their truth values on each of the above sets of possible worlds (in the order written):
\[ M = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}, \]

with rows indexed by $f_1, f_2, f_3$ and columns by ab, ab', a'b, a'b'.

In terms of normal disjunctive forms,

\[ a = ab \vee ab', \quad a' \vee b = ab \vee a'b \vee a'b', \quad b = ab \vee a'b, \]

so that

\[ P(a) = P(ab) + P(ab'), \]
\[ P(a' \vee b) = P(ab) + P(a'b) + P(a'b'), \]
\[ P(b) = P(ab) + P(a'b). \]

If we set

\[ x_1 = P(ab), \quad x_2 = P(ab'), \quad x_3 = P(a'b), \quad x_4 = P(a'b') \]

and

\[ \pi_1 = P(a), \quad \pi_2 = P(a' \vee b), \quad \pi_3 = P(b), \]

then $\pi = MX$, where

\[ \pi = (\pi_1, \pi_2, \pi_3)^T, \quad X = (x_1, x_2, x_3, x_4)^T. \]

The equation $\pi = MX$ represents a "consistency" condition for the assignments $x_i$. To include the condition $x_1 + x_2 + x_3 + x_4 = 1$ (besides $x_i \ge 0$), one usually adds the row (1, 1, 1, 1) to the top of M and modifies $\pi$ to

\[ \pi = (1, \pi_1, \pi_2, \pi_3)^T. \]
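The truth matrix of the example can be generated mechanically. The following Python sketch is an illustrative assumption about how one might do so: it enumerates the four sets of possible worlds in the order ab, ab', a'b, a'b', evaluates the three expressions on each, and computes the product MX for a hypothetical assignment X.

```python
from fractions import Fraction

# Truth assignments (a, b) for the worlds ab, ab', a'b, a'b', in that order.
WORLDS = [(1, 1), (1, 0), (0, 1), (0, 0)]

# The three Boolean expressions of the example.
EXPRESSIONS = [
    lambda a, b: a,                      # f1(a, b) = a
    lambda a, b: int((not a) or b),      # f2(a, b) = a -> b = a' v b
    lambda a, b: b,                      # f3(a, b) = b
]

def truth_matrix(expressions, worlds):
    """One row per expression, one column per set of possible worlds."""
    return [[int(f(a, b)) for (a, b) in worlds] for f in expressions]

def apply(matrix, x):
    """Matrix-vector product MX."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]

M = truth_matrix(EXPRESSIONS, WORLDS)

# Hypothetical probabilities of the four worlds (they sum to 1).
X = [Fraction(2, 5), Fraction(1, 10), Fraction(3, 10), Fraction(1, 5)]
pi = apply(M, X)   # (P(a), P(a' v b), P(b))
```

With this X the first component is P(ab) + P(ab') = 1/2, which matches the normal disjunctive form of a.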


The general probabilistic entailment problem is this. Given $a_i$ and $\pi_i = P(a_i)$, $i = 1, \ldots, n$, and a sentence of interest b. In view of the above procedure detailed in the example, one first needs to include b into the collection of the $a_i$'s, so that a partition of $\Omega$ is formed by the $a_1^{i_1} \cdots a_n^{i_n} b^k$, with $i_j, k \in \{0, 1\}$. Label these sets in some order, say $c_j$, $j = 1, \ldots, m$ ($\le 2^{n+1}$), and set $x_j = P(c_j)$. Let M be the (n + 2) by m matrix (with first row consisting of all 1's) representing the true/false values of the $a_i$'s and b in the $c_j$'s, let

\[ X = (x_1, \ldots, x_m)^T, \]

and let

\[ \pi = (1, \pi_1, \ldots, \pi_n, P(b))^T. \]

Formally, to solve for P(b), delete the last row of M (corresponding to the true/false values $M_{n+2,j}$, $j = 1, \ldots, m$, of b in the $c_j$'s), and P(b) in $\pi$. Then solve for X in the equation $MX = \pi$. If a solution X is found, then

\[ P(b) = \sum_{j=1}^{m} M_{n+2,j}\, x_j. \]

To  obtain  bounds  for  P(b),  a  similar  procedure  as  in  Hailperin’s  work  is  used. 
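For a problem as small as the two-sentence example, the bounds on P(b) can be obtained without a linear programming library by enumerating basic feasible solutions of $MX = \pi$, $X \ge 0$. The sketch below is one way to do this under stated assumptions: the rows of M are linearly independent, the feasible set is bounded (which the row of 1's guarantees), and the values P(a) = 7/10, P(a → b) = 4/5 are hypothetical. It is not Nilsson's or Hailperin's own algorithm, only a brute-force stand-in for the linear program.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(A, rhs):
    """Gauss-Jordan elimination over Fractions; None if A is singular."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def entailment_bounds(M, pi, target):
    """Min and max of target . X over the vertices of {X >= 0 : MX = pi}."""
    k, m = len(M), len(M[0])
    values = []
    for cols in combinations(range(m), k):
        sub = [[M[r][c] for c in cols] for r in range(k)]
        sol = solve_square(sub, pi)
        if sol is None or any(v < 0 for v in sol):
            continue                      # not a basic feasible solution
        values.append(sum(target[c] * v for c, v in zip(cols, sol)))
    return (min(values), max(values)) if values else None

# Columns: ab, ab', a'b, a'b'. Rows: all ones, a, a -> b.
M = [[1, 1, 1, 1],
     [1, 1, 0, 0],
     [1, 0, 1, 1]]
pi = [1, Fraction(7, 10), Fraction(4, 5)]
row_of_b = [1, 0, 1, 0]                   # deleted last row of the full matrix

bounds = entailment_bounds(M, pi, row_of_b)
```

For these inputs the bounds agree with the familiar probabilistic modus ponens interval, from P(a) + P(a → b) - 1 up to P(a → b). For larger problems a proper linear programming solver would be the natural replacement for the enumeration.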

In the following, we will first determine best upper and lower bounds for the basic connectives $\cdot$ and $\vee$ in the conditional case, then proceed to outline a generalization of Nilsson's computational procedure to a conditional setting. Specifically, we are seeking best lower and upper bounds for $\hat{P}((a|b) \cdot (c|d))$ and $\hat{P}((a|b) \vee (c|d))$. First, observe that

\[ (a|b) \cdot (c|d) \le (a|b),\ (c|d). \]

Thus

\[ \hat{P}((a|b) \cdot (c|d)) \le \min\{\hat{P}(a|b),\ \hat{P}(c|d)\}, \]

by Theorem 1, and equality is achieved, say, when $(a|b) \le (c|d)$. Next,

\[ (a|b) \vee (c|d) = (ab|b) \vee (cd|d) = (ab \vee cd \,|\, ab \vee cd \vee bd), \]

so that

\[ \hat{P}((a|b) \vee (c|d)) = \frac{P(ab \vee cd)}{P(ab \vee cd \vee bd)} \le \frac{P(ab \vee cd)}{P(bd)} \le \frac{P(ab) + P(cd)}{P(bd)} = \frac{\hat{P}(a|b)}{P(d|b)} + \frac{\hat{P}(c|d)}{P(b|d)}. \]

Hence

\[ \hat{P}((a|b) \vee (c|d)) \le \min\Bigl\{1,\ \frac{\hat{P}(a|b)}{P(d|b)} + \frac{\hat{P}(c|d)}{P(b|d)}\Bigr\}. \]

The fact that this is the best upper bound follows from the case abcd = 0 and $ab \vee cd \le bd$, in which both inequalities above become equalities.

Note that, since $\hat{P}$ is not additive on R|R, this upper bound is not a function of $\hat{P}(a|b)$ and $\hat{P}(c|d)$ alone. Obviously, when b = d = 1, it reduces to the bound in the unconditional case. However, as in the unconditional case, we still have

\[ ((a|b) \vee (c|d))' = (a|b)' \cdot (c|d)' = (a'|b) \cdot (c'|d), \]

and

\[ \hat{P}((a|b)') = \hat{P}(a'|b) = 1 - \hat{P}(a|b), \]

so that the lower bounds for $\hat{P}((a|b) \cdot (c|d))$ and $\hat{P}((a|b) \vee (c|d))$ can be obtained from the upper bounds of $\hat{P}((a'|b) \vee (c'|d))$ and $\hat{P}((a'|b) \cdot (c'|d))$, respectively, as

\[ \hat{P}((a|b) \cdot (c|d)) \ge 1 - \min\Bigl\{1,\ \frac{\hat{P}(a'|b)}{P(d|b)} + \frac{\hat{P}(c'|d)}{P(b|d)}\Bigr\} = \max\{0,\ s + t - 1\}, \]

where

\[ s = [\hat{P}(a|b) + P(d|b) - 1]/P(d|b), \]
\[ t = [\hat{P}(c|d) + P(b|d) - 1]/P(b|d), \]

and

\[ \hat{P}((a|b) \vee (c|d)) \ge 1 - \min\{\hat{P}(a'|b),\ \hat{P}(c'|d)\} = \max\{\hat{P}(a|b),\ \hat{P}(c|d)\}. \]
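The two bounds for the disjunction can be checked numerically using only formulas stated in the text: the value P(ab ∨ cd)/P(ab ∨ cd ∨ bd) of the disjunction, the upper bound with the ratios P(d|b) and P(b|d), and the lower bound max{P(a|b), P(c|d)}. The sketch below does this on randomly generated finite models; the model size, seed and function name are assumptions made for illustration.

```python
import random

def check_disjunction_bounds(trials=500, n_worlds=6, seed=1):
    """Count random models on which both disjunction bounds were verified."""
    rng = random.Random(seed)
    worlds = range(n_worlds)
    tol = 1e-12
    checked = 0
    for _ in range(trials):
        weights = [rng.random() + 0.01 for _ in worlds]   # strictly positive
        total = sum(weights)

        def prob(event):
            return sum(weights[k] for k in event) / total

        a, b, c, d = (frozenset(k for k in worlds if rng.random() < 0.6)
                      for _ in range(4))
        bd = b & d
        if not bd:              # the upper bound needs P(bd) > 0
            continue
        numerator = (a & b) | (c & d)
        value = prob(numerator) / prob(numerator | bd)
        p_a_b = prob(a & b) / prob(b)
        p_c_d = prob(c & d) / prob(d)
        upper = min(1.0, p_a_b / (prob(bd) / prob(b))
                    + p_c_d / (prob(bd) / prob(d)))
        lower = max(p_a_b, p_c_d)
        if not lower - tol <= value <= upper + tol:
            return -1           # signals a violated bound
        checked += 1
    return checked

n_checked = check_disjunction_bounds()
```

A positive return value means that no violation was found on the sampled models; it is a sanity check, not a proof.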

Turning to computational procedures in the conditional case, we first observe that a conditional event (a|b) (with $a \le b$) generates a partition of 1 consisting of the three sets ab, a'b, and b' (since $a \le b$ implies ab' = 0 and a'b' = b'). Also,

\[ (*) \qquad \hat{P}(a|b) = P(ab) + \hat{P}(a|b)\,P(b'). \]

To put it differently,

\[ \hat{P}(a|b) = \begin{pmatrix} 1 & 0 & \hat{P}(a|b) \end{pmatrix} \begin{pmatrix} P(ab) \\ P(a'b) \\ P(b') \end{pmatrix}. \]

Thus, consistent with a three-valued logic framework, for (a|b) we assign the truth value 1 on the set of possible worlds ab, the value 0 on a'b, and the value $\hat{P}(a|b)$ on b'. See Chapter 7. More generally, consider n conditional events $(a_i|b_i)$, $i = 1, 2, \ldots, n$. The associated partition of 1 consists of m sets of possible worlds $a_1^{i_1} \cdots a_n^{i_n} b_1^{j_1} \cdots b_n^{j_n}$ with $m \le 2^{2n}$. Label these possible worlds as $c_j$, $j = 1, 2, \ldots, m$. The "truth values" of each $(a_i|b_i)$ in these $c_j$ are determined as follows:

\[ t_j(a_i|b_i) = \begin{cases} 0 & \text{if } c_j \le a_i' b_i, \\ 1 & \text{if } c_j \le a_i b_i, \\ \hat{P}(a_i|b_i) & \text{if } c_j \le b_i'. \end{cases} \]

Thus, if we let the "truth values" matrix $M = [t_j(a_i|b_i)]$, $\pi_i = \hat{P}(a_i|b_i)$, $x_j = P(c_j)$, then $\pi = MX$.
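The three-valued truth assignment can be illustrated on a small model. In the sketch below, the four atoms w1, ..., w4, their probabilities and the two conditional events are hypothetical; the atoms play the role of the sets of possible worlds $c_j$. Each row of M carries the value 1 on ab, 0 on a'b and the extended value of (a|b) on b', and the product MX is then compared with the vector of extended values.

```python
from fractions import Fraction

# Hypothetical atoms (possible worlds) with probabilities x_j = P(c_j).
X = {"w1": Fraction(1, 5), "w2": Fraction(3, 10),
     "w3": Fraction(1, 10), "w4": Fraction(2, 5)}

def prob(event):
    return sum(X[w] for w in event)

def cond(a, b):
    """Extended value of (a|b): P(ab)/P(b), assuming P(b) > 0."""
    return prob(a & b) / prob(b)

def truth_row(a, b):
    """Three-valued row: 1 on ab, 0 on a'b, the value of (a|b) on b'."""
    value = cond(a, b)
    row = []
    for w in X:
        if w in (a & b):
            row.append(Fraction(1))
        elif w in b:
            row.append(Fraction(0))
        else:
            row.append(value)
    return row

# Two conditional events (a_i | b_i) with a_i <= b_i.
events = [
    (frozenset({"w1"}), frozenset({"w1", "w2"})),
    (frozenset({"w2", "w3"}), frozenset({"w2", "w3", "w4"})),
]

M = [truth_row(a, b) for a, b in events]
pi = [sum(t * x for t, x in zip(row, X.values())) for row in M]
direct = [cond(a, b) for a, b in events]
```

Note that M itself depends on the values being represented, which is why the relation is an identity to be satisfied rather than a definition of those values.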

Now if (c|d) is a conditional event of interest, and it is desired to compute or approximate $\hat{P}(c|d)$ in terms of the $\pi_i$'s (conditional beliefs), that is, to see how strongly (c|d) is entailed probabilistically by these conditional beliefs, one proceeds exactly as in the unconditional case. Specifically, add (c|d) to the $(a_i|b_i)$'s and consider the collection of sets of possible worlds $a_1^{i_1} \cdots a_n^{i_n} b_1^{j_1} \cdots b_n^{j_n} c^k d^l$ (of m elements, $m \le 2^{2n+2}$). Label these elements as $c_j$, $j = 1, \ldots, m$. Add the top row consisting of all 1's to M, and 1 to the top row of $\pi$. Solve for X (where $x_j = P(c_j)$) in

\[ \pi = MX. \]

Then $\hat{P}(c|d) = NX$, where N is a 1 × m row vector whose entries are the truth values of (c|d) on the $c_j$'s.

5.3  Random  conditional  objects 
1. Conditional random variables

Conceptually, random variables are well-behaved numerical (scalar or vector-valued) functions whose domain is some initial conceptualized space reflecting the problem of interest. Mathematically, one begins with a probability space $(\Omega, \mathcal{A}, P)$ and two random variables, say $X : \Omega \to \mathbb{R}^m$, $Y : \Omega \to \mathbb{R}^n$. The relevant joint random variable here is $(X, Y) : \Omega \to \mathbb{R}^{m+n}$ where, for any $\omega \in \Omega$,

\[ (X, Y)(\omega) = (X(\omega), Y(\omega)). \]

Then, X and Y can be considered marginal random variables relative to (X, Y), with all three inducing probability spaces

\[ (\mathbb{R}^m, B^m, P \circ X^{-1}), \quad (\mathbb{R}^n, B^n, P \circ Y^{-1}), \quad (\mathbb{R}^{m+n}, B^{m+n}, P \circ (X, Y)^{-1}), \]

where $B^k$ is the real Borel field of subsets over k-dimensional Euclidean space $\mathbb{R}^k$. For any sets $a_1 \in B^m$, $a_2 \in B^n$, and

\[ X^{-1} : B^m \to \mathcal{A}, \]
\[ Y^{-1} : B^n \to \mathcal{A}, \]

and

\[ (X, Y)^{-1} : B^{m+n} \to \mathcal{A}, \]

we have

\[ (P \circ (X, Y)^{-1})(a_1 \times a_2) = P(X^{-1}(a_1) \cap Y^{-1}(a_2)), \]
\[ (P \circ X^{-1})(a_1) = P(X^{-1}(a_1)) = (P \circ (X, Y)^{-1})(a_1 \times \mathbb{R}^n), \]

and

\[ (P \circ Y^{-1})(a_2) = P(Y^{-1}(a_2)) = (P \circ (X, Y)^{-1})(\mathbb{R}^m \times a_2). \]

Note also, using the notation $b = (b_1, \ldots, b_r) \in (B^{m+n})^r$ and obvious notation to indicate arbitrary combinations of basic operators, that any Boolean operator over $B^{m+n}$ is preserved by $(X, Y)^{-1}$ with corresponding evaluation

\[ (P \circ (X, Y)^{-1})(\mathrm{comb}(\cap, \cup, {}';\, b)) = P(\mathrm{comb}(\cap, \cup, {}';\, (X, Y)^{-1}(b))), \]

where

\[ (X, Y)^{-1}(b) = ((X, Y)^{-1}(b_1), \ldots, (X, Y)^{-1}(b_r)) \in \mathcal{A}^r. \]

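Consider again $a_2 \in B^n$, with $P(Y^{-1}(a_2)) > 0$. Then one defines the single conditional random variable

\[ (X|Y^{-1}(a_2)) : Y^{-1}(a_2) \to \mathbb{R}^m \]

by

\[ (X|Y^{-1}(a_2))(\omega) = X(\omega), \quad \omega \in Y^{-1}(a_2). \]

That is, $(X|Y^{-1}(a_2))$ is the restriction of X to $Y^{-1}(a_2)$. In turn, $(X|Y^{-1}(a_2))$ induces the (conditional) probability space $(X(Y^{-1}(a_2)),\ \mathcal{B}_{X|Y;a_2},\ P_{X|Y;a_2})$ on its range, where one assumes

\[ X(Y^{-1}(a_2)) = \{X(\omega) : \omega \in Y^{-1}(a_2)\} = \{X(\omega) : Y(\omega) \in a_2\} \in B^m \]

and

\[ \mathcal{B}_{X|Y;a_2} = X(Y^{-1}(a_2)) \cap B^m = \{X(Y^{-1}(a_2)) \cap b : b \in B^m\}. \]

Now, for any $a_1 \in B^m$, and hence for any $b = X(Y^{-1}(a_2)) \cap a_1$,

\[ (X|Y^{-1}(a_2))^{-1}(b) = X^{-1}(b) \cap Y^{-1}(a_2) = X^{-1}(a_1) \cap Y^{-1}(a_2) = (X, Y)^{-1}(a_1 \times a_2). \]

Hence,

\[ P_{X|Y;a_2}(b) = P \circ (X|Y^{-1}(a_2))^{-1}(b)/P(Y^{-1}(a_2)) \]
\[ = P(X^{-1}(a_1) \cap Y^{-1}(a_2))/P(Y^{-1}(a_2)) \]
\[ = P \circ (X, Y)^{-1}(a_1 \times a_2)/P(Y^{-1}(a_2)) \]
\[ = P(X^{-1}(a_1)\,|\,Y^{-1}(a_2)). \qquad (1) \]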
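On a finite sample space the restriction construction can be carried out directly. The following sketch is a hypothetical discrete analogue (a fair die, with X the parity of the face and Y the face itself); it computes the induced conditional probability from the restriction of X to the preimage of a set under Y and compares it with the usual ratio of probabilities.

```python
from fractions import Fraction

# Hypothetical finite probability space: a fair six-sided die.
OMEGA = range(1, 7)
P = {w: Fraction(1, 6) for w in OMEGA}

def X(w):
    return w % 2        # parity of the face: values in {0, 1}

def Y(w):
    return w            # the face itself

def preimage(f, values):
    """f^{-1}(values) as a subset of OMEGA."""
    return {w for w in OMEGA if f(w) in values}

def prob(event):
    return sum(P[w] for w in event)

def induced_conditional(a1, a2):
    """Probability induced on a1 by the restriction of X to Y^{-1}(a2)."""
    domain = preimage(Y, a2)                          # Y^{-1}(a2)
    restricted = {w for w in domain if X(w) in a1}    # (X|Y^{-1}(a2))^{-1}(a1)
    return prob(restricted) / prob(domain)

a1, a2 = {1}, {4, 5, 6}
value = induced_conditional(a1, a2)
direct = prob(preimage(X, a1) & preimage(Y, a2)) / prob(preimage(Y, a2))
```

Both computations give the conditional probability that the face is odd given that it is at least 4.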


Our approach here of viewing a single conditional random variable $(X|Y^{-1}(a))$ as the restriction of X to the fixed event $Y^{-1}(a)$ is the same as Rényi's (1970, p. 72) random variable (defined on a conditional probability space) with respect to the condition $Y^{-1}(a)$. The double conditional random variable is given by $(X|Y^{-1}(\cdot)) : \Omega_Y \to \mathbb{R}^m$,
where 


{a2:a2^J>arl(a2))>0}  2  2 

and  for  any  a2  e  B71  with  PQTlia^j)  >0;  (oe  TT^(a2), 

(x| r7(.»te  [a2))  =  oc\rha2  m.  . 

Hence,  for  any  a2  €  Bn  with  P{T^(a2))  >  0  and  for  any  a,  e  fP1, 

(xi  rh-yfhaj,  Cn2»  =  <x|  r1^))'1^) . 


(2) 


The extension of single and double conditional random variables to include sets $a_2$ of probability measure zero in $B^n$ can be accomplished through use of the Radon-Nikodym Theorem. Given all of the above standard development of conditional random variables, it is natural to inquire whether a direct connection can be established between these entities and an appropriately constructed random mechanism over the class of all conditional events. For any $a \in B^m$, $b \in B^n$, define the conditional event

\[ (a|b) = (a \times b \,|\, \mathbb{R}^m \times b), \]

and  define  the  Kronecker  form  8 :  ti?  -» Q  [{0} 


if  0)j  #  co 2 
if  (Oj  =  ©2 


Similarly,  for  any  cOj,<x>2eQ.  and  s  e  f1,  t  e  \f, 

(0)j\(02)  =  ({(©;i©2)}|^x  {co2}), 
(s\t)  =  ({(s,t))\\f1x{t)), 


with 


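\[ (\mathbb{R}^m|\mathbb{R}^n) = \{(s|t) : s \in \mathbb{R}^m,\ t \in \mathbb{R}^n\}, \]
\[ (B^m|B^n) = \{(a|b) : a \in B^m,\ b \in B^n\}, \]
\[ (\Omega|\Omega) = \{(\omega_1|\omega_2) : \omega_1, \omega_2 \in \Omega\}, \]
\[ (\mathcal{A}|\mathcal{A}) = \{(c|d) : c, d \in \mathcal{A}\}. \]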

Use  also  the  convention  for  equal  exponents 

(BmxBn\B?nxBn)=  {(a\b):a,beBm*Bn) . 

Then  define  the  random  conditional  event  mapping  (Xj  Y) :  (Q|f2)  -*  (IR^JK71)  by 
(X|Y)((ffl7|c^))  =  ((X,  Y)mr  co2))lRmx y(©2)) . 

It  follows,  by  a  slight  abuse  of  notation,  omitting  the  {0}  term,  that,  extending  via 

functional  images,  (X|Y)  to  (X|7) :  (^j  -4  (B?n\Bn)  ,  one  obtains  for  any 

c,  d  €  i^, 

<X\mc\d))  =  ({(X,  mdicoj,  co2 ))  :©ie  c,to2ed}| {(0^ x  7(©2)>  :  o>2  e  d}) 

=  (^X,Y)(cnd)jlRmxy(d)), 


where 


•> 


(X,  Y)(cnd)=  {(X(o)),  Y(o)»  :oecnd}, 


7(d)  =  {7(o>) :  7(a)) :  coe  d)  . 

Next,  consider  the  inverse  mapping  for  (XJY) ,  at  any  (flj\a2)  e 

(X|7)';(ai |c2)  =  {  (c|d) :  c,de  j(,  and  (X|7)(c|d)  =  (X,  7)((X,  7)'7(a;  |a2)) } 
=  ((c\d) :  c,  d  €  ^  and  ccf  =  (X,  Y)\a^,  a2)  ,d  =  Y~\ a^} 
=  ({c:ce  ^  and  c-rf(a2)  =  X';(fl;)nri(fl2)}|r/(a2)) 

=  {X~1(aJ)\r1(a2)).  (3) 

Comparing  (l)-(3)  shows  that 

pmyyha^))  =  P0C'1(a,)ir\a2)) 


•> 
=  (Po(X|ri(o2))-',Xai>

=  Po(X\r1{-yfha1,{a2)),  (4), 

so  that  in  a  natural  sense  QC\Y)  and  (X|y)‘%  can  be  considered  the  equivalent 
conditional  event  random  mechanism  corresponding  to  double  conditional  random  variable 

(X|^(*))-  Moreover,  using  (3)  and  the  operation  preserving  property  of  X^  and 

it  readily  follows  that  if  /:  (E?n\Bn)r  -4  (^n\Bn)  is  any  r-ary  extended  Boolean  function, 

then  (XJY)  ^  preserves /,  that  is, 

&\Y)‘1of=fo(X\Y)-1 ,  •  (5) 

analogous  to  the  preserving  properties  of  X,  Y,  (X,  Y)  relative  to  unconditional  operators. 
In  summary,  analogous  to  how  the  double  conditional  random  variable 

and  its  operator  inverse  (X|7*^(-))"^  :  lY1  -*  ^  determine  from  the  probability  space 
(Q.,  P)  the  induced  probability  spaces 

(X(Y2(- )),  ^ j  Y; .  •  PX,Y; ■ 

one  gets  that  random  conditional  event  mapping 

(X|y):(a|Q)H(Rw|!R,z)-1(^^ 

determines  from  probability  space  (£2,  jC,  P )  the  induced  "conditional  probability"  space 

(((Rm+/J j Km+n),  (B?nxBn\tfnxBn),  P°(X\YyJ), 

where $\hat{P}$ is the conditional probability extension of P. That is, $\hat{P} : (\mathcal{A}|\mathcal{A}) \to [0, 1]$ is defined for any $c, d \in \mathcal{A}$, and hence $(c|d) \in (\mathcal{A}|\mathcal{A})$, by

\[ \hat{P}((c|d)) = P(c|d) = P(c \cap d)/P(d), \]

provided  P(d)  >  0.  The  chief  relations  between,  and  evaluations  for,  double  conditional 
random  variables  and  random  conditional  events  are  given  in  equations  (3)  and  (4). 

13.  Random  sets 

It has become clear in the literature on uncertainty in AI that the mathematical concept of a random set is the cornerstone for evidential reasoning (for example, Hestir et al., 1990). For background on random sets see, for example, Matheron (1975), Goodman and Nguyen (1985). Below, we will illustrate, through an example, the use of (measure-free) conditional events and their algebraic structure in a problem of combining conditional evidence. For more details, see Nguyen and Rogers (1990).

In a given problem, we make the basic assumption that our knowledge at each state is expressed by a probability measure. When new evidence is obtained, this is to be "updated" by some "combining." We interpret a piece of evidence as a realized event supplied by some "test," which might be merely the opinion of an expert. This lack of precision in evidence suggests looking at a less precise formulation of randomness, namely random sets. Roughly speaking, a random set S is a measurable function from $(\Omega, \mathcal{A})$ to the power set of some set $\Theta$, equipped with an appropriate σ-field.

For ease of reference, we present below basic aspects of random sets in the context of the Dempster-Shafer theory of belief functions. For more details see Hestir, Nguyen and Rogers (1990). A random set S on a space $\Theta$ is described as follows.

Let $\mathcal{U}$ be a subset of the power set $\mathcal{P}(\Theta)$, $\sigma(\mathcal{U})$ a σ-field on $\mathcal{U}$, and $(\Omega, \mathcal{A}, P)$ a probability space. A random set with values in $\mathcal{U}$ is a map S from $\Omega$ to $\mathcal{U}$ which is $\mathcal{A}$-$\sigma(\mathcal{U})$-measurable. Briefly, a random set S on $\Theta$ is a triple $(\mathcal{U}, \sigma(\mathcal{U}), Q)$, where

\[ Q = P S^{-1}. \]

For a given $\Theta$, there are two general ways to specify the objects making up a random set.

(i) If $\mathcal{U} = \mathcal{P}(\Theta)$, then $\sigma(\mathcal{U})$ is constructed as follows. Let $\mathcal{J}$ be the collection of all finite subsets of $\Theta$. For $i, j \in \mathcal{J}$, $[i, j] = \{x \in \mathcal{P}(\Theta) : i \le x \le j\}$, where $\le$ denotes set inclusion. Let $\mathcal{H} = \{[i, j'] : i, j \in \mathcal{J}\}$. Then $\sigma(\mathcal{U})$ is taken to be the σ-field generated by $\mathcal{H}$, denoted as $\sigma(\mathcal{H})$. Each probability measure Q on $\sigma(\mathcal{H})$ determines a random set with values in $\mathcal{P}(\Theta)$.

(ii) If $\Theta$ is a topological space, then the topological structure of $\Theta$ can be taken into account. For example, consider the case $\Theta = \mathbb{R}$, the real line, or more generally, $\Theta$ a locally compact space. Let $\mathcal{F}$, $\mathcal{K}$, $\mathcal{G}$ be the classes of closed, compact, and open subsets of $\Theta$, respectively. If $\mathcal{U} = \mathcal{F}$, that is, if we are concerned with closed random sets, then $\mathcal{F}$ can be given a topology $\tau$ using the open subbase

\[ \{F \in \mathcal{F} : F \cap K = \emptyset\} \quad \text{for } K \in \mathcal{K}, \]

and

\[ \{F \in \mathcal{F} : F \cap G \ne \emptyset\} \quad \text{for } G \in \mathcal{G}. \]

Then $\sigma(\mathcal{U}) = \sigma(\tau)$ is the Borel σ-field on $\mathcal{F}$ in this topology. Each probability measure on $(\mathcal{F}, \sigma(\tau))$ determines a closed random set.

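As in the case of random vectors, one can associate with each random set S a generalized distribution function (GDF) characterizing S. In case (i), given by the two spaces $(\Omega, \mathcal{A}, P)$ and $(\mathcal{P}(\Theta), \sigma(\mathcal{H}), Q)$, and the map $S : \Omega \to \mathcal{P}(\Theta)$, let $\mathcal{J}' = \{j' : j \in \mathcal{J}\}$. Define

\[ F_S : \mathcal{J}' \to [0, 1] \]

by

\[ F_S(j') = P(S \le j') = Q([\emptyset, j']). \]

It can be shown that

\[ F_S(\Theta) = F_S(\emptyset') = Q([\emptyset, \emptyset']) = Q(\mathcal{P}(\Theta)) = 1, \]

and for $i, j \in \mathcal{J}$,

\[ Q([i, j']) = \sum_{\alpha=0}^{|i|} \sum_{r \in i_\alpha} (-1)^{\alpha} F_S((j \vee r)'), \]

where |i| denotes the cardinality of i, and

\[ i_\alpha = \{r : r \le i,\ |r| = \alpha\}. \]

Thus, in this case, a function $F : \mathcal{J}' \to [0, 1]$ uniquely determines a GDF if and only if $F(\Theta) = 1$ and for all $i, j \in \mathcal{J}$,

\[ \sum_{\alpha=0}^{|i|} (-1)^{\alpha} \sum_{r \in i_\alpha} F((j \vee r)') \ge 0. \]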

For example, if $\Theta$ is finite, then $\mathcal{J}' = \mathcal{J} = \mathcal{P}(\Theta)$. Let

\[ f(a) = \sum_{b \le a} (-1)^{|a - b|} F(b) \ge 0. \]

By the Möbius inversion formula, we get

\[ F(a) = \sum_{b \le a} f(b), \]

so that f is a "density function" on $\mathcal{P}(\Theta)$. Define Q on $\mathcal{P}\mathcal{P}(\Theta)$ by

\[ Q(\{a_1, \ldots, a_n\}) = \sum_{i=1}^{n} f(a_i). \]

The probability space $(\mathcal{P}(\Theta), \mathcal{P}\mathcal{P}(\Theta), Q)$ specifies a random set S on $\Theta$.

In case (ii), given by the spaces $(\Omega, \mathcal{A}, P)$ and $(\mathcal{F}, \sigma(\tau), Q)$, and the map
$S : \Omega \to \mathcal{F}$, the domain of a GDF $F$ will be $\mathcal{K}' = \{K' : K \in \mathcal{K}\}$. Let $T : \mathcal{K} \to [0, 1]$,
and $T(K) = 1 - F(K')$. As an application of Choquet's theorem (Matheron, 1975, p.
30-35), a function $F$ on $\mathcal{K}'$ uniquely determines a GDF if and only if

(1) $T(\emptyset) = 0$,

(2) if the sequence $K_n$ in $\mathcal{K}$ decreases to $K$ in $\mathcal{K}$, then $T(K_n) \to T(K)$,

(3) for all $n \ge 1$, all $K, K_1, \ldots, K_n$ in $\mathcal{K}$, the following functions are
non-negative:

$$\varphi_1(K; K_1) = T(K \cup K_1) - T(K),$$

$$\varphi_2(K; K_1, K_2) = \varphi_1(K; K_1) - \varphi_1(K \cup K_2; K_1),$$

$$\varphi_n(K; K_1, \ldots, K_n) = \varphi_{n-1}(K; K_1, \ldots, K_{n-1}) - \varphi_{n-1}(K \cup K_n; K_1, \ldots, K_{n-1}).$$

Such an $F$ uniquely determines a probability measure $Q$ on $(\mathcal{F}, \sigma(\tau))$ such that for all
$K \in \mathcal{K}$, $F(K') = Q([\emptyset, K'])$.


When $\Theta$ is finite, a belief function Bel on $\Theta$ is a map $\mathrm{Bel} : \mathcal{P}(\Theta) \to [0, 1]$
defined by

$$\mathrm{Bel}(a) = \sum_{b \subseteq a} m(b),$$

where the basic probability assignment function $m$ satisfies $m(\emptyset) = 0$ and

$$\sum_{b \in \mathcal{P}(\Theta)} m(b) = 1.$$

Thus, a belief function is nothing more than a GDF of a random set $S$ such that
$P(S = \emptyset) = 0$. (See Shafer, 1990.) Belief functions on finite sets can be characterized by
various set functions. Indeed, let $S$ be a non-empty random set on a finite set $\Theta$. If

$$Q_S(a) = P(a \subseteq S),$$

then

$$Q_S(a) = \sum_{a \subseteq b} m_S(b).$$

Also, by the Möbius inversion formulae (see Rota, 1964, or Aigner, 1979),

$$m_S(a) = \sum_{b \subseteq a} (-1)^{|a \setminus b|} \mathrm{Bel}_S(b) = \sum_{a \subseteq b} (-1)^{|b \setminus a|} Q_S(b).$$

Finally,

$$\mathrm{Pl}_S(a) = P(S \cap a \neq \emptyset) = 1 - P(S \cap a = \emptyset) = 1 - P(S \subseteq a') = 1 - \mathrm{Bel}_S(a').$$

It is interesting to note that the commonality $Q_S$ can be viewed as the Fourier transform
of $m_S$ (see Thoma, 1989, 1991). In other words, $Q_S$ is the "characteristic function" of
$S$. Of course, the harmonic analysis involved is over a semi-group structure.
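
The relations among $m$, Bel, $Q$ and Pl above can be checked numerically on a small frame. The following is only a minimal sketch: the frame `THETA`, the mass assignment `MASS`, and all function names are hypothetical choices made for illustration, and subsets are represented as Python frozensets.

```python
from itertools import combinations


def powerset(items):
    """Return all subsets of a finite collection as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def belief(m, a):
    """Bel(a): total mass of the focal sets b contained in a."""
    return sum(w for b, w in m.items() if b <= a)


def commonality(m, a):
    """Q(a): total mass of the focal sets b containing a."""
    return sum(w for b, w in m.items() if a <= b)


def plausibility(m, a):
    """Pl(a): total mass of the focal sets b that meet a."""
    return sum(w for b, w in m.items() if b & a)


def mass_from_belief(bel, frame):
    """Moebius inversion: m(a) = sum over b in a of (-1)^|a - b| Bel(b)."""
    return {a: sum((-1) ** len(a - b) * bel[b] for b in powerset(a))
            for a in powerset(frame)}


def mass_from_commonality(com, frame):
    """Moebius inversion: m(a) = sum over b containing a of (-1)^|b - a| Q(b)."""
    subsets = powerset(frame)
    return {a: sum((-1) ** len(b - a) * com[b] for b in subsets if a <= b)
            for a in subsets}


# Hypothetical frame and mass assignment (dyadic values keep the sums exact).
THETA = frozenset({"x", "y", "z"})
MASS = {frozenset({"x"}): 0.5, frozenset({"x", "y"}): 0.25, THETA: 0.25}

BEL = {a: belief(MASS, a) for a in powerset(THETA)}
COM = {a: commonality(MASS, a) for a in powerset(THETA)}
```

Applying `mass_from_belief(BEL, THETA)` or `mass_from_commonality(COM, THETA)` should return the original masses (with zero on non-focal subsets), and `plausibility(MASS, a)` should agree with `1 - belief(MASS, THETA - a)` for every subset `a`.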

The interpretation of belief functions in terms of random sets allows the rigorous
formulation of the problem of combining evidence, where each piece of evidence is
assumed to be represented by a belief function. Specifically, using the concept of
conditional events, two (non-empty) random sets $S_1$ and $S_2$ can be combined into one
non-empty random set $(S_1 \cap S_2 \mid S_1 \cap S_2 \neq \emptyset)$.

In the following, the range of $S$ will be simply a finite Boolean ring $R$ or a finite
subset of an arbitrary Boolean ring $R$. In this case, $S$ is completely characterized by its
generalized distribution function (GDF) $F_S$, called in the literature a belief function
(Shafer, 1976): $F_S(a) = P(S \le a)$, where $P$ is a probability on $(\Omega, \mathcal{A})$.

A typical situation in the problem of updating of knowledge is the following. The
measure $P_0$ on the range of the variable of interest is postulated but only partial
information about $P_0$ is available. This is the Bayesian case of incomplete prior
information. Specifically, consider the case where $P_0$ is unknown, but we are given
(say, by an expert) that $a, b, c \in R$, that $a$ and $c$ are $P_0$-independent given $b$, and that
$P_0(a|b) = \alpha$, $P_0(c|b) = \beta$. The question is: what can be said about the values $P_0(r|b)$
for the other $r \in R$? Consider the (Boolean) quotient ring $R/Rb'$. We extract the prior
information as follows.

Let $X$, $Y$ be random sets with values in the power set of $R/Rb'$, with ranges
$\{(a|b), (a'|b)\}$, $\{(c|b), (c'|b)\}$, respectively. Also (note that $P$ is on $(\Omega, \mathcal{A})$),

$$P(X = (a|b)) = \alpha = 1 - P(X = (a'|b)),$$

and

$$P(Y = (c|b)) = \beta = 1 - P(Y = (c'|b)).$$

We combine $X$ and $Y$ through the random set $Z$ with range

$$\{(ac|b), (ac'|b), (a'c|b), (a'c'|b)\},$$

with probabilities (in view of the conditional independence assumption)

$$\alpha\beta, \quad \alpha(1 - \beta), \quad (1 - \alpha)\beta, \quad (1 - \alpha)(1 - \beta),$$

respectively. See Section 5.4 for the concept of qualitative conditional independence.
The GDF of $Z$ is the map $F_Z : R/Rb' \to [0, 1]$ defined by

$$F_Z(r|b) = \sum_{(a_i c_j|b) \le (r|b)} P_Z(a_i c_j|b),$$

where $P_Z = P Z^{-1}$ as usual, and where $a_i$ is $a$ or $a'$, and $c_j$ is $c$ or $c'$. In terms of
the order relation $\le$ among conditional events, $(a_i c_j|b) \le (r|b)$ if and only if $a_i c_j b \le r$.
For all $(r|b) \in R/Rb'$,

$$F_Z(r|b) \le P_0(r|b).$$

Replacing $(r|b)$ by $(r|b)' = (r'|b)$ in this inequality yields

$$P_0(r|b) \le 1 - F_Z(r'|b).$$

Based on the available evidence, for $r \in R$, an interval approximation for $P_0(r|b)$ is
$[F_Z(r|b), \ 1 - F_Z(r'|b)]$.
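
The interval above can be computed directly when events are modeled as subsets of a finite universe. The sketch below is an illustration under assumptions that are not in the text: the universe `OMEGA` of 4-bit integers, the concrete events `A`, `C`, `B`, `D`, the values of `ALPHA` and `BETA`, and the function name `interval_for` are all hypothetical.

```python
def interval_for(r, a, c, b, alpha, beta, omega):
    """Return (F_Z(r|b), 1 - F_Z(r'|b)) for events given as frozensets."""
    # The four values of Z with the product probabilities from the text.
    focal = [(a & c, alpha * beta),
             (a - c, alpha * (1 - beta)),
             (c - a, (1 - alpha) * beta),
             (omega - (a | c), (1 - alpha) * (1 - beta))]

    def gdf(event):
        # (a_i c_j | b) <= (event | b) exactly when a_i c_j b lies in event.
        return sum(p for e, p in focal if (e & b) <= event)

    return gdf(r), 1 - gdf(omega - r)


# Hypothetical universe: 4-bit integers; bit 0 is a, bit 1 is c, bit 2 is b,
# and bit 3 is an extra event d about which the expert says nothing.
OMEGA = frozenset(range(16))
A = frozenset(w for w in OMEGA if w & 1)
C = frozenset(w for w in OMEGA if w & 2)
B = frozenset(w for w in OMEGA if w & 4)
D = frozenset(w for w in OMEGA if w & 8)
ALPHA, BETA = 0.5, 0.25

BOUNDS_A = interval_for(A, A, C, B, ALPHA, BETA, OMEGA)
BOUNDS_A_AND_D = interval_for(A & D, A, C, B, ALPHA, BETA, OMEGA)
BOUNDS_A_OR_D = interval_for(A | D, A, C, B, ALPHA, BETA, OMEGA)
```

In this toy setting the interval for `A` collapses to the single value `ALPHA`, as it should, while events involving the unconstrained `D` receive genuinely wider intervals.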

5.4  Qualitative  conditional  independence 

Qualitative independence (or $Q$-independence for short), or measure-free or
algebraic independence of ordinary events is a well-known concept (for example, Rényi,
1970). It is not simply for academic interest that the above concept should be extended to
conditional events. In fact, our motivation for considering $Q$-independence comes from the
problem of fast computations in inference networks of expert systems. For example, in
some models of medical diagnosis, the variables of interest are represented as nodes in a
graph, the causal relationships among these variables are represented by (directed) edges
of the graph, and the strengths of such relationships are usually quantified by an
uncertainty measure (Bayesian probability, Dempster-Shafer belief function, Zadeh
possibility measure). In AI activities, there is no general agreement on the choice of such
uncertainty measures (see, for example, Henkind and Harrison, 1988). Thus the design of
inference networks (or influence diagrams) should be done without reference to the
uncertainty measure used. Not only does some sort of "independence" assumption generally
simplify the calculations within the knowledge representation, but by the very nature of
many application domains, neighboring interactions among variables exhibit some form of
conditional independence. This is typically the case of Markov random fields (for
example, Lauritzen and Spiegelhalter, 1988).

First, we give a brief historical background. The concept of $Q$-independence of
events was treated in some detail in Rényi (1970). With our notation concerning a
Boolean ring $R$, two elements $a$ and $b$ of $R$ are said to be $Q$-independent if and only
if $ab$, $ab'$, $a'b$, $a'b'$ are all not 0 (implying also that $a$, $a'$, $b$, $b'$ are all not 0). One
possible interpretation is clear: viewing elements of $R$ as events, if, for example,
$a'b = 0$, then $b \le a$, so that when $b$ is "realized" $a$ is also realized; it follows that $a$
and $b$ cannot be "independent." It is easy to check that $P$-independence implies
$Q$-independence (for events with $0 < P(a), P(b) < 1$): $P(ab) = P(a)P(b) > 0$ implies
$ab \neq 0$, and the rest follow by the use of complements. This concept of $Q$-independence
of non-zero $a$ and $b$ can be reformulated as follows.

Let $\pi(a) = \{a, a'\}$, $\pi(b) = \{b, b'\}$ be partitions of 1. Then $a$ and $b$ are
$Q$-independent if and only if for all $\alpha \in \pi(a)$, and all $\beta \in \pi(b)$, one has $\alpha\beta \neq 0$. If we view
$a$ and $b$ through their indicator functions $1_a$ and $1_b$, then the following equivalent
definition can be used to extend the concept of $Q$-independence to variables. The $\sigma$-field
generated by $1_a$ is $\sigma(1_a) = \{0, 1, a, a'\}$; similarly, $\sigma(1_b) = \{0, 1, b, b'\}$. Then $a$ and
$b$ are $Q$-independent if and only if for all $\alpha \in \sigma(1_a) \setminus \{0\}$, and all $\beta \in \sigma(1_b) \setminus \{0\}$, one has
$\alpha\beta \neq 0$.

Next, to be concrete, let $R$ be a $\sigma$-field of subsets of some set $\Omega$. Let $X$ and $Y$
be measurable functions, defined on $(\Omega, R)$, with countable ranges in the real line $\mathbb{R}$. Let
the countable partitions (with no empty subsets) generated by $X$ and $Y$ be,
correspondingly,

$$\pi(X) = \{a_n\}, \quad \pi(Y) = \{b_m\}.$$

Then $X$ and $Y$ are said to be $Q$-independent if and only if $\pi(X)$ and $\pi(Y)$ are
$Q$-independent in the sense that for all $\alpha \in \pi(X)$, $\beta \in \pi(Y)$, one has $\alpha\beta \neq 0$. Note that the
$\sigma$-field generated by $X$ is

$$\sigma(X) = \Big\{ \bigcup_{i \in I} a_i : I \subseteq X(\Omega) \Big\},$$

where $a_i = X^{-1}(i)$ for $i \in X(\Omega)$, and $I$ could be $\emptyset$. From this, $X$ and $Y$ are
$Q$-independent if and only if for all $\alpha \in \sigma(X) \setminus \{0\}$, $\beta \in \sigma(Y) \setminus \{0\}$, one has $\alpha\beta \neq 0$.
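
For finite sample spaces these conditions are easy to test mechanically. The following sketch, with hypothetical helper names and a hypothetical six-point universe, checks $Q$-independence of two events and of two finite-valued variables by testing that every pair of blocks of the generated partitions has a non-empty intersection.

```python
from itertools import product


def q_independent(part_x, part_y):
    """Q-independence of two partitions: every pair of blocks meets."""
    return all(x & y for x, y in product(part_x, part_y))


def partition_of(variable, omega):
    """Partition of omega into the non-empty level sets of a variable."""
    blocks = {}
    for w in omega:
        blocks.setdefault(variable(w), set()).add(w)
    return [frozenset(s) for s in blocks.values()]


def events_q_independent(a, b, omega):
    """Q-independence of events: ab, ab', a'b, a'b' are all non-empty."""
    return q_independent([a, omega - a], [b, omega - b])


# Hypothetical universe and variables, chosen only for illustration.
OMEGA = frozenset(range(6))
PARITY = partition_of(lambda w: w % 2, OMEGA)
MOD3 = partition_of(lambda w: w % 3, OMEGA)
MOD4 = partition_of(lambda w: w % 4, OMEGA)
```

Here `q_independent(PARITY, MOD3)` holds because every parity class meets every residue class modulo 3, whereas `q_independent(PARITY, MOD4)` fails because the block `{2}` contains no odd number.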




Recently, Shafer et al. (1987), also motivated by the study of inference networks in
expert systems, defined $Q$-conditional independence for finite partitions, or equivalently,
for variables with finite ranges. They did not consider the concept of $Q$-conditional
independence for events. As far as we know, $Q$-independence of "continuous" variables
has not been discussed in the literature; also, $Q$-conditional independence was not
addressed in Rényi (1970). Below, we follow the recent work of Nguyen and Rogers
(1990) to present a comprehensive discussion of all the above mentioned notions.

Back to the abstract setting, let $(R, \vee, \wedge, ', 0, 1)$ be a Boolean ring.

Definition 1.

(i) Let $A$ and $B$ be two subsets of $R$ consisting of non-zero elements. Then
$A \perp_R B$ if and only if for $a \in A$ and $b \in B$, we have $ab \neq 0$.

(ii) Let $a, b \in R$. Then $a \perp_R b$ if and only if

$$\pi(a) = \{a, a'\} \perp_R \{b, b'\} = \pi(b).$$

(iii) Let $X$ and $Y$ be discrete variables. Then $X \perp_R Y$ if and only if $\pi(X) \perp_R \pi(Y)$
if and only if for $a \in \sigma(X) \setminus \{0\}$ and $b \in \sigma(Y) \setminus \{0\}$, we have $ab \neq 0$.

For the concept of $Q$-conditional independence of events, we observe that
$P$(probabilistic)-conditional independence of $a$ and $b$ given $c$ is expressed by the
formula

$$P(ab|c) = P(a|c)P(b|c),$$

which can be rewritten as

$$P((ab|c)) = P((a|c))P((b|c)),$$

where we use $P$ as a function on $R|R$ with arguments $(ab|c)$, $(a|c)$, $(b|c)$, ..., viewed
as conditional events. This suggests that one could define $(a|b)$ and $(c|d)$ to be
independent with respect to $P$ if and only if

$$P((a|b) \cdot (c|d)) = P((a|b))P((c|d)).$$

Definition 2.

(i) Let $a, b, c \in R$. Then $a \perp_P b \mid c$ if and only if

$$P(ab|c) = P(a|c)P(b|c)$$

if and only if

$$P((a|c)(b|c)) = P(a|c)P(b|c).$$

(ii) Let $a, b, c \in R$. Then $a$ and $b$ are $Q$-independent given $c$, in symbols,
$a \perp_Q b \mid c$, if and only if $(a|c) \perp_{R/Rc'} (b|c)$, if and only if for $(\alpha|c) \in \pi(a|c) =
\{(a|c), (a'|c)\}$, and for $(\beta|c) \in \pi(b|c) = \{(b|c), (b'|c)\}$,
we have

$$(\alpha|c) \cdot (\beta|c) \neq (0|c).$$
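
Since $(\alpha|c) \cdot (\beta|c) \neq (0|c)$ amounts to $\alpha\beta c \neq 0$, Definition 2 (ii) can be tested directly on events modeled as finite sets. The sketch below is only an illustration; the function name and the eight-point universe are hypothetical.

```python
def q_cond_independent(a, b, c, omega):
    """a and b are Q-independent given c: abc, ab'c, a'bc, a'b'c all non-empty."""
    return all(x & y & c
               for x in (a, omega - a)
               for y in (b, omega - b))


# Hypothetical universe and events, chosen only for illustration.
OMEGA = frozenset(range(8))
A = frozenset({0, 1, 4})
B = frozenset({0, 2, 4})
C = frozenset({0, 1, 2, 3})
```

With these sets, `q_cond_independent(A, B, C, OMEGA)` holds, while `q_cond_independent(A, B, OMEGA - C, OMEGA)` fails because no point of the complement of `C` lies in `A` but outside `B`. Independence given $c$ therefore says nothing about independence given $c'$.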

Remark. $(\alpha|c) \cdot (\beta|c) \neq (0|c)$ is equivalent to $(\alpha\beta|c) \neq (0|c)$ or to $\alpha\beta c \neq 0$. Also, in
considering $\pi(a|c)$, it is implicitly assumed that $(a|c)$ and $(a'|c)$ are not $(0|c)$, which
is equivalent to $ac \neq 0$ and $a'c \neq 0$. Moreover, it can be checked that the $Q$-conditional
independence in Definition 2 (ii) is strictly weaker than that for finite partitions in Shafer
et al. (1987). Indeed, in our notation, their definition is expressed as:

Let $X$, $Y$ and $Z$ be discrete variables. Let

$$A(c, \pi(X)) = \{(a|c) : a \in \pi(X), \ ac \neq 0\}.$$

Then, $X \perp_Q Y \mid Z$ if and only if $\pi(X) \perp_Q \pi(Y) \mid \pi(Z)$ if and only if for $c \in \pi(Z)$,

$$A(c, \pi(X)) \perp_{R/Rc'} A(c, \pi(Y)).$$

We see immediately that if $1_a \perp_Q 1_b \mid 1_c$ according to this last definition, then $a \perp_Q b \mid c$
according to Definition 2 (ii), but that the converse does not hold. Thus, unlike the
unconditional case, $Q$-conditional independence of events cannot be defined in terms of
variables. In order to define a $Q$-independence which will be compatible with stochastic
independence for "continuous" variables, it is necessary to pay attention to "small sets."
In probability theory, these are $P$-null sets which form a $\sigma$-ideal of subsets. This structure
is abstracted to $\sigma$-ideals (for example, Halmos, 1963), a notion dual to that of the "bunch"
in Rényi (1970). See Section 5.1.

In the discrete case, one needs to consider only the trivial $\sigma$-ideal $\{0\}$. If $P$ is a
probability measure on $R$, then $\mathcal{H}_P = \{a \in R : P(a) = 0\}$ is clearly a $\sigma$-ideal. Now
$X \perp_P Y$ if and only if $\sigma(X) \perp_P \sigma(Y)$ in the sense that for $a \in \sigma(X)$ and $b \in \sigma(Y)$,
$P(ab) = P(a)P(b)$. If $a \in \sigma(X) \setminus \mathcal{H}_P$ and $b \in \sigma(Y) \setminus \mathcal{H}_P$, then $P(ab) > 0$, which implies
that $ab \notin \mathcal{H}_P$ so that $ab \neq 0$. If we were to require only that $a \in \sigma(X) \setminus \{0\}$, it could
happen that either $P(a) = 0$ or $ab = 0$. Thus we are led to


Definition 3. Let $X$, $Y$, $Z$ be real-valued measurable functions (defined, say, on $(\Omega, R)$).

(i) $X \perp_Q Y$ if and only if there is a $\sigma$-ideal $\mathcal{H}$ such that for $a \in \sigma(X) \setminus \mathcal{H}$ and
$b \in \sigma(Y) \setminus \mathcal{H}$, we have $ab \neq 0$.

(ii) $X \perp_Q Y \mid Z$ if there is a $\sigma$-ideal $\mathcal{H}$ such that for $a \in \sigma(X) \setminus \mathcal{H}$, $b \in \sigma(Y) \setminus \mathcal{H}$, and
$c \in \sigma(Z) \setminus \mathcal{H}$, we have $a \perp_Q b \mid c$.

CHAPTER 6

CONDITIONAL  PROBABILITY  LOGIC 

Unlike Adams' approach to a logic of conditionals (Adams, 1975), we will take
advantage of the rich algebraic structure of the space of conditional events $R|R$ to
develop a conditional probability logic (CPL). The concrete syntactic component of this
logic is especially useful for the purpose of automation. The problem of modeling
defaults and production rules in expert systems using measure-free conditionals, as well as
aspects of non-monotonic deduction, will be discussed in Chapter 8.

6.1  Essentials  of  probability  logic 

In a sense, logic is about the study of knowledge representation languages in which
the basic notion of entailment (for inference) can be captured. We are concerned here
with the situation in which the uncertainty in our knowledge is taken in a quantitative
way. See, for example, Bibel (1986) for general methods of automated reasoning.

However, because of the relevancy to the treatment of conditional events, we
address only the probability logic approach to managing quantitative uncertainty in expert
systems. See, for example, Bibel (1986), Pearl (1988) for both Bayesian and
non-Bayesian formalisms. The so-called probabilistic logic (Nilsson, 1986) in AI has
been discussed in Chapter 5, together with an extension to the conditional case. In this
chapter, we are concerned with probability logic and its extension to conditional
probability logic from the viewpoint of mathematical logic. Since the CPL developed in
this chapter is a direct extension of probability logic (PL), we will first review the basics
of the latter. We start with a review of classical two-valued logic $C_2$.

In $C_2$, the base space is a Boolean ring $R$ (representing propositions) with its usual
operators and relations. Taking the concept of truth as the (only) primitive notion, one
proceeds to derive the concept of logical entailment. Each element of $R$ is either true ($T$
or 1) or false ($F$ or 0), that is, the truth-space of $R$ is $\{0, 1\}$. To emphasize the fact
that elements of $R$ are true or false on different "possible worlds," one introduces the
concept of models. Roughly speaking, a model (or semantic valuation) of $R$ is an
assignment of truth values to elements of $R$. However, such an assignment should be
logical (or consistent), that is, it should be such that no element of $R$ could be
simultaneously true and false in the same assignment. Further, two elements $a$, $b$ are
both true if and only if their conjunction $ab$ is true. The mathematical translation of the
concept of consistent assignments is that of a Boolean homomorphism. The truth-space
$\{0, 1\}$ is viewed as a 2-element Boolean algebra. That is, for $x, y \in \{0, 1\}$,

$$xy = \min\{x, y\}, \qquad x \vee y = \max\{x, y\}, \qquad 0' = 1, \quad 1' = 0.$$

We use the same notation $\wedge$ (or $\cdot$), and $\vee$ on both the spaces $R$ and $\{0, 1\}$. A map
$h : R \to \{0, 1\}$ is a (Boolean) homomorphism if for $a, b \in R$,

$$h(a') = [h(a)]',$$

$$h(ab) = h(a)h(b),$$

and

$$h(a \vee b) = h(a) \vee h(b).$$

(The first condition is equivalent to $h(a) \neq h(a')$.)

A model is defined to be a homomorphism $R \to \{0, 1\}$, and we denote the set of all
models of $R$ by $H$. Thus an element $a \in R$ is true in the model $h \in H$ if and only if
$h(a) = 1$.
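
For a small finite ring the set $H$ of models can be found by brute force. The sketch below takes $R$ to be the power set of a hypothetical three-point set (an assumption made only for illustration), enumerates all maps into $\{0, 1\}$, and keeps those satisfying the three homomorphism conditions.

```python
from itertools import combinations, product

# Hypothetical finite Boolean ring: all subsets of a three-point set.
POINTS = (0, 1, 2)
TOP = frozenset(POINTS)
RING = [frozenset(c) for r in range(len(POINTS) + 1)
        for c in combinations(POINTS, r)]


def is_homomorphism(h):
    """Check h(a') = h(a)', h(ab) = h(a)h(b), h(a v b) = h(a) v h(b)."""
    complements_ok = all(h[TOP - a] == 1 - h[a] for a in RING)
    products_ok = all(h[a & b] == min(h[a], h[b]) and
                      h[a | b] == max(h[a], h[b])
                      for a in RING for b in RING)
    return complements_ok and products_ok


# Enumerate all 2^8 assignments of truth values and keep the consistent ones.
MODELS = []
for values in product((0, 1), repeat=len(RING)):
    h = dict(zip(RING, values))
    if is_homomorphism(h):
        MODELS.append(h)
```

In this example exactly three assignments survive, one for each point $p$, namely $h(a) = 1$ if and only if $p \in a$.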

For further syntactic development, and for concreteness, we look at an alternative
way of formalizing the concept of models. For elementary background on ideals and
filters, as well as some algebraic logic, see, for example, Mendelson (1970), or Halmos
(1962, 1963). Since each $h : R \to \{0, 1\}$ can be identified with a subset of $R$, namely
$h^{-1}(1)$, we can consider the space $\Omega = \{h^{-1}(1) : h \in H\}$ as that of all models of $R$. We
describe now the elements of $\Omega$. Let $\omega = h^{-1}(1)$. Then first, $\omega \subseteq R$ is a filter of the ring
$R$. That is,

(1) $1 \in \omega$ (1 is the greatest element of $R$),

(2) If $a, b \in \omega$ then $ab \in \omega$, and

(3) If $a \in \omega$ and $b \in R$, then $a \vee b \in \omega$.

Let $a \in \omega$. Then $a = a \cdot 1$ and $h(a) = h(a)h(1)$, implying that $h(1) = 1$, that is, that
$1 \in \omega = h^{-1}(1)$. If $a, b \in \omega$, then $h(ab) = h(a)h(b) = 1$, so $ab \in \omega$. For (3), $a = a(a \vee b)$,
so that $1 = h(a) = h(a)h(a \vee b) = h(a \vee b)$. Thus $\omega = h^{-1}(1)$ is a filter.

Moreover, each $\omega = h^{-1}(1)$ is maximal, that is, $\omega$ is a proper filter, meaning that
$\omega \neq R$, or equivalently, that $0 \notin \omega$, and if $\gamma \subseteq R$ is a filter such that $\omega \subseteq \gamma$, then either
$\gamma = \omega$ or $\gamma = R$. Since $h(1) = 1$, $h(0) = h(1') = 1' = 0$, whence $0 \notin \omega$, and $\omega$ is
proper. Let $\gamma$ be a filter such that $\omega \subseteq \gamma$. If $\omega \neq \gamma$, then there exists $b \in \gamma$ with $b \notin \omega$.
Then $h(b) = 0$, so $h(b') = 1$ and $b' \in \omega$ and hence $b' \in \gamma$. But then, since $\gamma$ is a filter,
$bb' = 0 \in \gamma$, and $\gamma = R$.

Thus, elements of $\Omega$ are maximal filters of $R$. In fact, all maximal filters of $R$
can be described by homomorphisms, that is, $\Omega$ is the set of all maximal filters of $R$. To
see this, it suffices to show that if $\gamma$ is a maximal filter of $R$, then its indicator function
$1_\gamma : R \to \{0, 1\}$, defined by $1_\gamma(a) = 1$ or 0 according as to whether $a \in \gamma$ or $a \notin \gamma$, is a
homomorphism. The condition that $h(a) \neq h(a')$ turns out to be a characterization of
maximality for filters.

Lemma 1. A filter $\gamma$ of $R$ is maximal if and only if for $a \in R$, either $a \in \gamma$ or $a' \in \gamma$
(but not both).

Proof. Suppose that $\gamma$ is a maximal filter and that $b \notin \gamma$. Then

$$\beta = \{xy : x \in \gamma, \ b \le y\}$$

is a filter. Taking $y = 1$ gets $\gamma \subseteq \beta$. Taking $x = 1$ and $y = b$ gets $b \in \beta$. Thus $\beta$
strictly contains $\gamma$, and thus $\beta = R$. Hence $0 = xy$ for some $x \in \gamma$ and $y \ge b$, and so
$x \le y' \le b'$. Thus $b' \in \gamma$.

The proof of the converse parallels the proof above that $\omega = h^{-1}(1)$ is maximal.

From the lemma above, it is easy to check that indicator functions of maximal filters
are homomorphisms. Indeed, by Lemma 1, $1_\gamma(a') = [1_\gamma(a)]'$. For $a, b \in R$, we have
$1_\gamma(ab) = 1$ if and only if $ab \in \gamma$ if and only if $a, b \in \gamma$, so $1_\gamma(ab) = 1_\gamma(a) 1_\gamma(b)$. Similarly,
$1_\gamma(a \vee b) = 1_\gamma(a) \vee 1_\gamma(b)$, and $1_\gamma$ is a homomorphism.

Regarding the set $\Omega$ of maximal filters of $R$ as the set of models of $R$, an
element $a \in R$ is true in a model $\omega \in \Omega$ if $a \in \omega$.
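
Lemma 1 can be verified exhaustively on a small example. The sketch below again assumes, purely for illustration, that $R$ is the power set of a three-point set. It lists all proper filters, extracts the maximal ones, and tests the "exactly one of $a$, $a'$" criterion on each proper filter.

```python
from itertools import combinations

# Hypothetical finite Boolean ring: all subsets of a three-point set.
POINTS = (0, 1, 2)
TOP = frozenset(POINTS)
RING = [frozenset(c) for r in range(len(POINTS) + 1)
        for c in combinations(POINTS, r)]


def subsets(items):
    """Generate every subset of a finite list as a frozenset."""
    return (frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r))


def is_filter(f):
    """Contains 1, closed under products, and closed upward under joins."""
    return (TOP in f
            and all((a & b) in f for a in f for b in f)
            and all((a | b) in f for a in f for b in RING))


def decides_every_element(f):
    """For each a in R, exactly one of a and a' belongs to f."""
    return all((a in f) != ((TOP - a) in f) for a in RING)


# Proper filters exclude 0 (the empty set); maximal ones have no proper extension.
PROPER = [f for f in subsets(RING) if is_filter(f) and frozenset() not in f]
MAXIMAL = [f for f in PROPER if not any(f < g for g in PROPER)]
```

Here there are seven proper filters, three of which are maximal, and `decides_every_element` returns true on exactly those three.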

Remarks 

1. Since filters and ideals are dual in the sense that if $\alpha$ is a filter of $R$, then
$\alpha' = \{x' : x \in \alpha\}$ is an ideal, and if $\gamma$ is an ideal, then $\gamma' = \{x' : x \in \gamma\}$ is a filter, the
classical Stone Representation Theorem for Boolean rings can be also stated in terms of
maximal filters (that is, models). Specifically, define $\psi : R \to \mathcal{P}(\Omega)$, power set of $\Omega$, by

$$\psi(a) = \{\omega \in \Omega : a \in \omega\}.$$

Then $\psi(0) = \emptyset$, $\psi(1) = \Omega$, and for $a \neq 0$, $\psi(a) \neq \emptyset$. (The third property is not a trivial
one. See the second remark below.) By maximality, for $\omega \in \Omega$ and $a \in R$, if $a \notin \omega$
then $a' \in \omega$, so that

$$\psi(a') = [\psi(a)]^c.$$

(We use $(\cdot)^c$, $\cap$, and $\cup$ for set operations on $\mathcal{P}(\Omega)$.) Also, $a, b \in \omega$ if and only if
$ab \in \omega$, implying

$$\psi(ab) = \psi(a) \cap \psi(b).$$

This, and DeMorgan's laws readily yield

$$\psi(a \vee b) = \psi(a) \cup \psi(b).$$

Hence $\psi$ is a homomorphism from $R$ into $\mathcal{P}(\Omega)$; $\psi$ is one-to-one since $\psi^{-1}(\emptyset) = \{0\}$.

Thus an appropriate subset of models is identified with a proposition in $R$, namely, a
proposition $a$ is identified with the set of models $\omega$ in which $a$ is true.

2. The characterization of maximality in Lemma 1 is a property shared by atoms of
$R$. For $a \in R$, $a \neq 0$, and an atom $\alpha$, we have either $\alpha \le a$ or $\alpha \le a'$ (but not both). In
fact, the principal filter $R \vee \alpha = \{r \vee \alpha : r \in R\}$ generated by an atom $\alpha$ is maximal.
Moreover, $\alpha$ is the unique atom in $R \vee \alpha$. In general, the class of all maximal filters $\Omega$
of $R$ is larger than that of these principal maximal filters. However, if the ring $R$ is such
that every maximal filter is principal, then they coincide. That is, $R \vee a$ is maximal if
and only if $a$ is an atom. Indeed, if $b < a$, then $R \vee b$ properly contains $R \vee a$, $b$
being in the former and not in the latter. For example, if $R$ is finite, then models of $R$
can be identified with atoms of $R$.

We continue now with the basics of $C_2$. For deduction, we consider the concept of
logical entailment relation, denoted by $\vdash$. Roughly speaking, $b$ logically entails $a$, in
symbols $b \vdash a$, if whenever $b$ is true, $a$ is true. In our setting here, this means that $b \vdash a$
if and only if for $\omega \in \Omega$, if $b \in \omega$, then $a \in \omega$. The following fact is well-known.

Lemma 2. $b \vdash a$ if and only if $b \le a$.

Proof. Suppose $b \le a$. For $\omega \in \Omega$ such that $b \in \omega$, we have $a = b \vee a \in \omega$, since
$\omega$ is a filter.

Conversely, suppose $b \vdash a$. For each $\omega \in \Omega$ such that $b \in \omega$, we have, by
hypothesis, $a \in \omega$, and hence $ab \in \omega$ since $\omega$ is a filter. We are going to show that
$b = ab$. Suppose $ab < b$, that is, $b(ab)' \neq 0$. But then, there is $\gamma \in \Omega$ such that
$b(ab)' \in \gamma$. Now $b(ab)' \le b$ implying $b \in \gamma$, and $b(ab)' \le (ab)'$ implying $(ab)' \in \gamma$, that is,
$ab \notin \gamma$ since $\gamma$ is maximal, which is a contradiction.
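
Lemma 2 can be confirmed by enumeration when $R$ is finite. The sketch below assumes, for illustration only, that $R$ is the power set of a three-point set, so that (as noted in the remarks on atoms) each model is the principal filter generated by a singleton. The names are hypothetical.

```python
from itertools import combinations

# Hypothetical finite Boolean ring: all subsets of a three-point set.
POINTS = (0, 1, 2)
RING = [frozenset(c) for r in range(len(POINTS) + 1)
        for c in combinations(POINTS, r)]

# For a finite ring, each model is the principal filter of an atom {p}.
MODELS = [frozenset(a for a in RING if p in a) for p in POINTS]


def entails(b, a):
    """b entails a: every model in which b is true also makes a true."""
    return all(a in w for w in MODELS if b in w)


# Compare semantic entailment with the order relation on every pair.
AGREES_WITH_ORDER = all(entails(b, a) == (b <= a)
                        for a in RING for b in RING)
```

The flag `AGREES_WITH_ORDER` records that semantic entailment and the order relation coincide on all 64 pairs of elements.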

Remarks 

1. In the proof above, we have used the following well-known fact. If $x \in R$ and
$x \neq 0$, then there is a maximal filter $\omega$ such that $x \in \omega$. $R \vee x$ is a filter containing $x$,
and this filter can be enlarged to a maximal one. That statement is not at all obvious,
involving set theoretical niceties such as Zorn's lemma. It should be noted also that for
each element $x \neq 1$, there exists a maximal filter not containing $x$. Indeed, any maximal
filter containing $x'$ has that property. In particular, the only element contained in every
maximal filter is 1.

2. A simple proof that every non-zero element of $R$ is contained in a maximal
filter, in the case of atomic $R$, goes as follows. As noted, $R \vee x$ is a filter containing $x$,
and since $R$ is atomic, there is an atom $y$ with $y \le x$. Then $R \vee y$ is a maximal filter
containing $x$.

3. It is obvious that $b \le a$ if and only if $b \to a = b' \vee a = 1$, that is, $b \to a$ is a
tautology. (An element $x$ is a tautology if for every $\omega \in \Omega$, $x \in \omega$. Thus the only
tautology is 1.) Lemma 2 expresses the logical entailment relation $\vdash$ in classical
two-valued logic in terms of the (partial) order relation $\le$. This explains the monotonicity
of $\vdash$ (due to the transitivity property of $\le$). For more details, see Chapter 8.

Now to Probability Logic (PL). PL, as a multi-valued logic, has been treated, for
example, in Rescher (1969), Hailperin (1984), Nilsson (1986). See also Goodman and
Nguyen (1985). The formal language of PL is the same as that of $C_2$. Thus the base
space of PL is also a Boolean ring $R$. As far as AI is concerned, there is a need to
generalize $C_2$ to PL in order to reason with uncertain information, such as in expert
systems.

For each sentence $a \in R$, there are two sets of "possible worlds" (that is, models):
$\{\omega : a \in \omega\}$, and $\{\omega : a \notin \omega\}$. Not knowing the actual model, one considers the
probability of $a$ being true as a "truth value" for $a$. This is obviously a generalization of
$C_2$. In view of the axioms of probability measures on $R$, PL, with truth-space the unit
interval $[0, 1]$, is a non-truth functional system. A model for PL is simply a probability
measure $P$ on $R$.

In view of Stone's Representation Theorem (in terms of maximal filters of $R$),
models for PL can be also viewed as probability measures on a class of subsets of models
in $C_2$. Also, with its axioms, each probability measure $P$ on $R$ acts like a
"homomorphism-like" map. As in classical deduction, the concept of probabilistic entailment
relation is crucial for probabilistic reasoning in intelligent systems (for example, Pearl,
1988; Neapolitan, 1990). See also Hailperin (1984), Nilsson (1986).

We say that $a$ is probabilistically entailed by $b$, in symbols $b \vdash_P a$, if for all
probability measures $P$ on $R$, $P(b) \le P(a)$. In view of Lemma 2 of Section 2.2, this is
equivalent to $b \le a$ or $b \vdash a$. Also, $a \in R$ is a probability tautology if $P(a) = 1$, for all
probability measures $P$ on $R$. Again, by Lemma 2 of Section 2.2, this means that $a = 1$.

At a practical level, probabilistic entailment is defined as the computation of the
probability of a sentence in terms of the probability values of other sentences. As
Hawthorne (1988) stated clearly, this entailment is in fact a "partial" entailment, that is,
entailment with "degrees." This is precisely the problem of combination of (probabilistic)
evidence. The decision as to whether or not to "infer" $a$ from the $b$'s depends upon the
magnitude of $P(a)$. A computational procedure for this problem is given in Nilsson
(1986). See also McLeish (1988), and Section 5.2. For discussions concerning PL and
non-monotonic logics, see, for example, Grosof (1988), Hawthorne (1988), and Chapter 8,
Section 8.2.

Probability logic is sound and complete. We close this section with the concepts of
truth semantics and of probabilistic entailment in the conditional case (Adams, 1975).
This will serve as a comparison with our development of conditional probability logic
in the next section.

First, we take this opportunity to clarify several basic aspects in Adams' book, in
view of the mathematical development of the conditional space $R|R$ and its associated
three-valued logic (Chapters 2, 3). By Lewis' Triviality Result, it is seen that if we assign
conditional probabilities to "indicative conditionals," then these conditionals, at the syntax
level, are not, in general, elements of the Boolean ring $R$. This fact is expressed in Pearl's
book as "conditionals are non-propositional" or "... classical logic does not possess an
operator equivalent to the conditioning bar $(\cdot \mid \cdot)$ in probability" (Pearl, 1988, pp. 475,
482). In Adams' book, it is expressed as "conditional propositions are not assumed to
correspond to subsets of a sample space," and as "these objects do not have truth values"
(Adams, 1975, Preface and p. 9). It becomes clear that, under the fundamental assumption
of Adams' work (p. 3), namely "the probability of an indicative conditional is a
conditional probability," a conditional "if $b$ is the case then $a$ is" is a subset of $R$
rather than an element of $R$. As far as truth values are concerned, it is apparent that
Adams was referring to classical two-valued logic. Each conditional $(a|b)$ does have
truth values, namely true (1), false (0) or undefined ($u$). As such, we agree with Adams
that "probabilities of conditionals are not equal to their probabilities of being true." All
the above can be proved in our representation of conditional events $(a|b)$ as cosets of
$R$.

Let $P$ be a probability on $R$. Of course $P(a|b)$ is a function of $P(ab)$ and $P(b)$
(when $P(b) > 0$), and it is true that the truth-value of $(a|b)$, denoted as $t(a|b)$, is a
function of $t(ab)$ and $t(b)$. Indeed,

$$t(a|b) = \begin{cases} 1 & \text{if } t(ab) = 1, \\ 0 & \text{if } t(a'b) = 1, \\ u & \text{if } t(b') = 1, \end{cases}$$

where $t : R \to \{0, 1\}$ is a Boolean homomorphism. The knowledge of $t(ab)$ and $t(b)$
completely specifies $t(a|b)$, since $\{ab, a'b, b'\}$ is a partition of 1.
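
The three-valued truth function, and the difference between the conditional probability of $(a|b)$ and its probability of being true, can be illustrated on a finite sample space. The sketch below rests on assumptions that are not in the text: a uniform probability on a six-point space, events given as sets, each world identified with the homomorphism "membership of that world," and hypothetical function names.

```python
from fractions import Fraction


def conditional_truth(a, b, world):
    """Truth value of (a|b) at a world: 1 on ab, 0 on a'b, 'u' on b'."""
    if world not in b:
        return "u"
    return 1 if world in a else 0


def probability_true(a, b, prob):
    """Probability that (a|b) takes the truth value 1."""
    return sum(p for w, p in prob.items() if conditional_truth(a, b, w) == 1)


def conditional_probability(a, b, prob):
    """P(a|b): mass where (a|b) is true, relative to mass where it is defined."""
    defined = sum(p for w, p in prob.items()
                  if conditional_truth(a, b, w) != "u")
    return probability_true(a, b, prob) / defined


# Hypothetical uniform probability on six worlds and two events.
PROB = {w: Fraction(1, 6) for w in range(6)}
A = {0, 1, 2}
B = {1, 2, 3, 4}
```

With these choices `conditional_probability(A, B, PROB)` is 1/2 while `probability_true(A, B, PROB)` is 1/3, which is one concrete instance of the statement quoted from Adams.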

The point is this. Since $(a|b)$ is not "Boolean," its truth-values should not be
restricted to $\{0, 1\}$. We see that, with the truth-space being $\{0, 1, u\}$, conditionals are
truth-functional and their probabilities are conditional probabilities. On the other hand,
contrary to Adams' attitude concerning Lewis' Triviality Result (Adams, p. 35), namely
"The author's very tentative opinion on the "right way out" of the triviality argument is
that we should regard the inapplicability of probability to compounds of conditionals as a
fundamental limitation of probability, on a par with the inapplicability of truth to simple
conditionals. What is needed at the present stage is less mathematical theorizing than
close examination of the phenomenon of inference involving these problematic
constructions, ...", we have resolved these problems from a mathematical analysis. Indeed,
there is no problem with compounds of conditionals, since there is no need to assign
probabilities directly to such objects. Simple conditionals have truth-values in $\{0, 1, u\}$
and, as cosets of the ring $R$, have well-defined probabilities as conditional probabilities
(see Chapter 5). Viewing $R|R$ as the space of conditionals with three-valued logic, we
can derive basic connectives on it (see Section 3.4). Given a system of truth tables in a
three-valued logic, there corresponds a system of connectives $\wedge$, $\vee$, $'$, say, on $R|R$.
These connectives are operators on $R|R$, that is, any compound of conditionals is a
simple conditional, so that probability is assigned in the same way as for simple
conditionals.

In our notation, R is a factual language, R|R is its conditional extension, and
(a|b) is b ⇒ a in Adams' notation for conditionals. Let t : R → {0, 1} be a truth
function. Adams considered the "truth-conditional semantics," that is, truth evaluations on
R|R as follows.

α) (a|b) is "verified" under t if t(a) = t(b) = 1,
β) (a|b) is "falsified" under t if t(b) = 1 and t(a) = 0.

But, in our development of three-valued logic for R|R, α) and β) say nothing more than
that the truth-values of (a|b) are 1 and 0, respectively, in {0, 1, u}. Of course, when a
conditional (a|b) is neither verified nor falsified, its truth-value is u. It is this
"non-verification value" u which completes the discussion concerning the semantics of
conditionals.

Finally, as mentioned in previous chapters, although conditionals are not treated as
mathematical entities in Adams' book, Adams did propose basic connectives among them,
namely "contrary," "quasi-conjunction" and "quasi-disjunction" (Adams, 1975, pp. 46-47).
These connectives were proposed earlier by Schay (Schay, 1968), and were rediscovered
later, in an independent work, by Calabrese (Calabrese, 1987). These connectives
correspond precisely to Sobocinski's three-valued logic. (See Section 3.5.)


Now to Adams' ε-semantics. From the formal language ℒ (or R) of classical
two-valued logic, consider its extension ℒ̂ to "conditional formulas," denoted
ℒ̂ = {a ⇒ b : a, b ∈ ℒ}. Here a ⇒ b stands for an indicative conditional of the form "if a is
the case then b is" in natural language, for example, in ordinary English. In the study of
probabilistic semantics for default reasoning (Pearl, 1988, Chapter 10), ℒ̂ is the set of
default statements which are "non-propositional" in the sense that they involve the "arrow"
⇒ connecting two propositional formulas. So a ⇒ b is non-"Boolean," that is, a ⇒ b is not
an element of ℒ or of the Boolean ring R. See also Dubois and Prade (1989) for the
modeling of default rules by conditionals. Also, here ⇒ is not the material implication
connective →. In fact, except for a mathematical representation of the object a ⇒ b,
Adams' intention was to provide a semantic evaluation map compatible with conditional
probability for ℒ̂. Basic connectives on ℒ̂ are defined as follows (see Chapters 1, 4).

    (a ⇒ b)' = (a ⇒ b'),

    (a ⇒ b) ∧ (c ⇒ d) = (a ∨ c ⇒ (a → b)(c → d)),

    (a ⇒ b) ∨ (c ⇒ d) = (a ∨ c ⇒ ab ∨ cd).


As before, a probability model is a probability measure P on ℒ. The associated
"truth conditional semantics" for ℒ̂ is defined by

    P̂ : ℒ̂ → [0, 1],  P̂(a ⇒ b) = P(b|a).

The set of conditional formulas {(a_i ⇒ b_i), i = 1, ..., n} is said to entail the conditional
formula c ⇒ d if and only if for all ε > 0, there is δ > 0 such that for all P on ℒ for
which P(a_i) > 0, i = 1, ..., n, and P(c) > 0, if P(b_i|a_i) ≥ 1 − δ for i = 1, ..., n, then
P(d|c) ≥ 1 − ε. This concept of entailment in Adams' conditional probability logic is
suitable for default reasoning in AI, in which ℒ is a collection of propositions and ℒ̂ is
a set of default statements. Indeed, for a default statement a ⇒ b = "almost all a's are
b's," one translates "almost all" into "P(b|a) is arbitrarily close to 1, short of actually
being 1" (Pearl, 1988, p. 480); moreover, the set of defaults

    {(a_i ⇒ b_i), i = 1, ..., n}

logically entails (c ⇒ d) if, whenever P(b_i|a_i) is "high," i = 1, ..., n, P(d|c) is also "high."
For more details on this "ε-semantics," we again refer the reader to Pearl's book (1988).

6.2  Syntax  and  basic  properties. 

Let the Boolean ring R be the base space for classical two-valued logic (also for
probability logic). The base space for the conditional probability logic (CPL) we are
going to develop is the mathematical conditional extension R|R with its algebraic
structure established in Chapters 2, 3 and 4. Now elements of R|R, that is, "conditional
formulas," are mathematical entities, and we can describe special elements of R|R as in
the case of R. Specifically, we are going to describe syntactically "contradictions" and
"tautologies" on R|R in a manner compatible with the truth conditional semantics of Section
6.3. By the same token, various characterizations of implicative relations in CPL are
given, generalizing those of material implication in classical logic.

First, in probability logic, an element a ∈ R is called a "contradiction" or a
"P-tautology" according to P(a) = 0 or P(a) = 1 for all probability measures P on R.
By Lemma 1 of Section 2.2, these are equivalent to a = 0 or a = 1. The counterparts
of 0 and 1 on R|R are now described. Observe that

    R|R = ∪_{b ∈ R} R/Rb',

where each R/Rb' is a Boolean ring with its contradiction and tautology (0|b), (1|b),
respectively, provided b ≠ 0. Thus,

Definition 1. The classes of zero-type conditionals and unity-type conditionals are,
respectively,

    Z = {(0|b), b ∈ R\{0}}

and

    U = {(1|b), b ∈ R\{0}}.

In Section 6.3, we will show that these concepts are compatible with truth conditional
semantics on R|R. The following theorem summarizes basic properties of Z and U.

Theorem 1.

(i) Both Z and U are closed under · and ∨ on R|R,

(ii) Z (resp. U) has the ideal-like (resp. filter-like) property: (R|R)·Z = Z
(resp. (R|R) ∨ U = U),

(iii) Z ∪ {(0|0)} is closed under ·, ∨, + on R|R,

(iv) Z and U are "complementary" in the sense that

    Z = {(b|b)' : (b|b) ∈ U}, and
    U = {(0|b)' : (0|b) ∈ Z}.

Proof. (i), (iii), and (iv) are obvious from the definitions of the operations on R|R.
Since

    (a|b)·(0|c) = (0|a'b ∨ c) ∈ Z,

    (1|1)·(0|b) = (0|b),

    (a|b) ∨ (c|c) = (ab|b) ∨ (c|c) = (ab ∨ c|ab ∨ c) ∈ U,

and

    (0|0) ∨ (b|b) = (b|b),

part (ii) holds. □
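
Parts (ii) and (iv) can be checked by brute force on a small example. The sketch below is a finite sanity check, not a proof. It assumes that R is the Boolean ring of subsets of a two-point sample space (encoded as bitmasks), that a conditional (a|b) is stored in the normal form (ab, b), and that the product and disjunction on R|R are (a|b)(c|d) = (ac | a'b ∨ c'd ∨ bd) and its De Morgan dual. All function names are hypothetical.

```python
from itertools import product

N = 2                       # size of the toy sample space (an assumption)
FULL = (1 << N) - 1         # the unit 1 of R
R = range(FULL + 1)         # every subset, as a bitmask


def neg(x):
    return FULL ^ x


def cond(a, b):
    """Normal form of (a|b): the pair (ab, b)."""
    return (a & b, b)


def times(x, y):
    """(a|b)(c|d) = (ac | a'b v c'd v bd)."""
    (a, b), (c, d) = x, y
    return cond(a & c, (neg(a) & b) | (neg(c) & d) | (b & d))


def join(x, y):
    """(a|b) v (c|d) = (a v c | ab v cd v bd)."""
    (a, b), (c, d) = x, y
    return cond(a | c, (a & b) | (c & d) | (b & d))


def complement(x):
    a, b = x
    return cond(neg(a), b)


ALL = {cond(a, b) for a, b in product(R, R)}
Z = {cond(0, b) for b in R if b != 0}       # zero-type conditionals
U = {cond(FULL, b) for b in R if b != 0}    # unity-type conditionals

ideal_like = {times(x, z) for x in ALL for z in Z} == Z
filter_like = {join(x, u) for x in ALL for u in U} == U
complementary = ({complement(u) for u in U} == Z
                 and {complement(z) for z in Z} == U)
print(ideal_like, filter_like, complementary)
```

Raising `N` enlarges the algebra being searched; the three flags are expected to stay `True`.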

It is known in classical logic that the material implication b → a is a tautology (that
is, b → a = 1) if and only if b ≤ a (that is, b "strictly" implies a). This fact is a
characterization of the binary Boolean operator →. In other words, → is the only binary
Boolean operator on R having this property. Indeed, it is obvious that if f : R² → R,
f(a, b) = b → a = b' ∨ a, then f(a, b) = 1 whenever b ≤ a. Conversely, if f : R² → R is
such that

    f(a, b) = 1 if and only if b ≤ a,

then

    f(1, 1) = f(1, 0) = f(0, 0) = 1,

and f(0, 1) = 0, so that the normal disjunctive form of f(a, b) reduces to

    f(a, b) = ab ∨ ab' ∨ a'b' = a ∨ a'b' = a ∨ b'.
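
The uniqueness claim can be illustrated on the two-element Boolean ring {0, 1} by enumerating all sixteen binary Boolean operations. The sketch below does only that: it assumes truth tables stored as dictionaries and uses hypothetical names, and it does not replace the normal-form argument for a general R.

```python
from itertools import product

PAIRS = list(product((0, 1), repeat=2))     # all (a, b) in {0,1} x {0,1}


def satisfies_characterization(table):
    """table maps (a, b) to f(a, b); test that f(a, b) = 1 iff b <= a."""
    return all((table[(a, b)] == 1) == (b <= a) for a, b in PAIRS)


# All 16 binary Boolean operations, listed by their truth tables.
tables = [dict(zip(PAIRS, values)) for values in product((0, 1), repeat=4)]
solutions = [t for t in tables if satisfies_characterization(t)]

# Material implication b -> a = b' v a on {0, 1}.
material = {(a, b): (1 - b) | a for a, b in PAIRS}

print(len(solutions), solutions[0] == material)
```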


The situation in conditional logic is somewhat different in the sense that there are
various conditional Boolean polynomials in two variables satisfying the counterpart of the
equivalence between strict implication in two-valued logic and being a tautology.
Specifically, strict implication in two-valued logic is replaced by the order relation on
R|R (see Chapter 3), and tautologies in conditional logic are elements of U, that is, of the
form (b|b) with b ≠ 0.

We are going to characterize conditional Boolean polynomials f in two variables
satisfying the following equivalence condition. For any a, b, c, d ∈ R with b, d ≠ 0,

    f((a|b), (c|d)) ∈ U

if and only if

    (a|b) ≥ (c|d).

We may assume without loss of generality that f is of the form

    f = (α|β) = (α|α ∨ γ),

where α, γ : R⁴ → R are Boolean functions, and αγ = 0. Theorems 2 and 3 below not only
shed light on conditional logical operations taking values in U, but are also needed
in proving that CPL is sound and complete (Section 6.4).


Theorem 2. Let f : (R|R)² → R|R be a conditional Boolean polynomial in two variables.
The following are equivalent.

(i) For a, b, c, d ∈ R, with b, d ≠ 0,

    f((a|b), (c|d)) ∈ U if and only if (c|d) ≤ (a|b).

(ii) f is of the form f = (α|β) = (α|α ∨ η), where

    η(a, b, c, d) = (ab)'(cd) ∨ (a'b)(c'd)',

and α is a Boolean function such that α ≤ η', and α ≠ 0 when η = 0.

Proof. To prove that (i) implies (ii), we use the criterion that (c|d) ≤ (a|b) if and
only if cd ≤ ab and a'b ≤ c'd. This is the same as

    (cd)(ab)' = 0 = (a'b)(c'd)',

that is,

    (ab)'(cd) ∨ (a'b)(c'd)' = 0.

Thus η = η(a, b, c, d) = (ab)'(cd) ∨ (a'b)(c'd)' = 0 if and only if (c|d) ≤ (a|b). For
(i) to hold, f = (α|α ∨ γ) has to be such that γ = η, since otherwise it is possible that
simultaneously γ(a, b, c, d) = 0, α(a, b, c, d) ≠ 0, and η(a, b, c, d) ≠ 0, contradicting (i).
That (ii) implies (i) is easy. □
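
The key step of the proof, that η vanishes exactly when cd ≤ ab and a'b ≤ c'd, can be confirmed exhaustively on a small algebra. The sketch below assumes R is the ring of subsets of a two-point set encoded as bitmasks, and the helper names are hypothetical. It also checks the identity η' = ab ∨ c'd ∨ b'd', which is used in the discussion of the admissible forms of α.

```python
from itertools import product

N = 2                       # toy sample space size (an assumption)
FULL = (1 << N) - 1
R = range(FULL + 1)


def neg(x):
    return FULL ^ x


def leq(x, y):
    """x <= y in the Boolean ring, that is, x y' = 0."""
    return (x & neg(y)) == 0


def eta(a, b, c, d):
    """eta = (ab)'(cd) v (a'b)(c'd)'."""
    return (neg(a & b) & (c & d)) | ((neg(a) & b) & neg(neg(c) & d))


def below(a, b, c, d):
    """Order criterion from the proof for (c|d) <= (a|b)."""
    return leq(c & d, a & b) and leq(neg(a) & b, neg(c) & d)


quads = list(product(R, repeat=4))
criterion_ok = all((eta(*q) == 0) == below(*q) for q in quads)
complement_ok = all(
    neg(eta(a, b, c, d)) == (a & b) | (neg(c) & d) | (neg(b) & neg(d))
    for a, b, c, d in quads
)
print(criterion_ok, complement_ok)
```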


The precise forms as well as the total number of f's in (ii) can be determined as
follows. Let

\[
w_i(a|b) = \begin{cases} ab & \text{if } i = 1, \\ a'b & \text{if } i = 0, \\ b' & \text{if } i = u. \end{cases}
\]

We have

\[
\begin{aligned}
\eta(a, b, c, d) &= (ab)'(cd) \vee (a'b)(c'd)' \\
&= a'bcd \vee b'cd \vee a'bd' \\
&= w_0(a|b)\,w_1(c|d) \vee w_u(a|b)\,w_1(c|d) \vee w_0(a|b)\,w_u(c|d) \\
&= \bigvee_{(i,j) \in J} w_i(a|b)\,w_j(c|d),
\end{aligned}
\]

where J = {(0,1), (u,1), (0,u)}. Thus α(a, b, c, d) must be of the form

\[
\bigvee_{(i,j) \in K} w_i(a|b)\,w_j(c|d),
\]

where

    K ⊆ {(0,0), (u,0), (1,0), (1,1), (1,u), (u,u)}.

As examples, for

    K = {(0,0), (u,0), (1,0), (1,1), (1,u)},

    α = a'bc'd ∨ abc'd ∨ abcd ∨ abd' ∨ b'c'd = ab ∨ c'd.

When η = 0,

    η' = ab ∨ c'd ∨ b'd' = 1.

But for b, d ≠ 0, b'd' < 1, so that ab ∨ c'd ≠ 0. Here, ab ∨ c'd ∨ η = b ∨ d. Thus f is
of the form

    f_2((a|b), (c|d)) = (ab ∨ c'd | b ∨ d).

For

    K = {(0,0), (u,0), (1,0), (1,1), (1,u), (u,u)},

α = ab ∨ c'd ∨ b'd', which is not 0 when η = 0. In fact α = 1 when η = 0. Thus

    f_1((a|b), (c|d)) = (ab ∨ c'd ∨ b'd' | 1),

since for all a, b, c, d ∈ R,

    ab ∨ c'd ∨ b'd' ∨ η = 1.

These two forms have interesting interpretations. The last,

    f_1((a|b), (c|d)) = (ab ∨ c'd ∨ b'd' | 1),

is the consequent of Lukasiewicz's implication (see Section 3.4), where the consequent of
a conditional (a|b) is defined to be C(a|b) = ab.
Using Theorem 3, Section 3.4, it can be checked that the first form

    f_2((a|b), (c|d)) = (ab ∨ c'd | b ∨ d)

corresponds to Sobocinski's truth table for implication. This truth table is given in
Rescher (1969, p. 70) with the sign + (respectively −) in front of the truth values to
indicate "designated" (respectively, "anti-designated") values for consideration of
tautologies (respectively, contradictions) in multi-valued logic. We will discuss this
further in Section 6.3. The conditional disjunctions of Adams and of Calabrese, and one of
Schay's conditional disjunctions, ∨_0, are all defined to be

    (a|b) ∨_0 (c|d) = (ab ∨ cd | b ∨ d).

Thus

    f_2((a|b), (c|d)) = (a|b) ∨_0 (c|d)'.

We will return to this observation in Section 6.4.

Not all subsets K of

    {(0,0), (u,0), (1,0), (1,1), (1,u), (u,u)}

lead to α's satisfying condition (ii) of Theorem 2. For example, if

    K = {(1,0), (1,1), (1,u), (u,u)},

then

    α = abd' ∨ abcd ∨ abc'd ∨ b'd' = ab ∨ b'd'.

Taking a = c = 0 and b = d = 1, we have

    ab ∨ c'd ∨ b'd' = 1,

so that η(0, 1, 0, 1) = 0. But α(0, 1, 0, 1) = 0. Thus α(a, b, c, d) = ab ∨ b'd' does not
satisfy our condition (ii).

We now look closer at f_1 and f_2. First, since f_1 = (α|α ∨ η) with α = η', we
see that for all a, b, c, d ∈ R, f_1 satisfies

    f_1((a|b), (c|d)) ∈ U if and only if (c|d) ≤ (a|b).  (*)

Now f_2 does not satisfy (*). Indeed, when η = 0, we have ab ∨ c'd = b ∨ d. This
equality holds also when b = d = 0, but then f_2((a|0), (c|0)) ∉ U. However, f_2 satisfies

    f_2((a|b), (c|d)) ∈ U if and only if (c|d) ≤ (a|b) and b or d ≠ 0.  (**)

Indeed, when (c|d) ≤ (a|b), we have ab ∨ c'd = b ∨ d. If b or d is ≠ 0, then ab ∨ c'd
= b ∨ d ≠ 0, and hence f_2((a|b), (c|d)) ∈ U. Conversely, if f_2((a|b), (c|d)) ∈ U, then
ab ∨ c'd = b ∨ d ≠ 0, implying that η = 0 and b or d ≠ 0. On the other hand, f_1 does
not satisfy (**), since f_1((a|0), (c|0)) ∈ U.

It turns out that (*) and (**) characterize f_1 and f_2, respectively. Consider first
the condition (*). As before, f = (α|α ∨ η), where

\[
\alpha(a, b, c, d) = \bigvee_{(i,j) \in K} w_i(a|b)\,w_j(c|d),
\]

with

    K ⊆ {(0,0), (u,0), (1,0), (1,1), (1,u), (u,u)}.

We are going to show that if K is a strict subset, then there is an (a, b, c, d) ∈ R⁴ such
that α(a, b, c, d) = 0 while η(a, b, c, d) = 0. We only have to look at subsets of
ab ∨ c'd ∨ b'd' = η'. We already know from above that if α is ab ∨ c'd or ab ∨ b'd' or
c'd ∨ b'd', then f will not satisfy (*). Thus it suffices to consider subsets of the form
ab ∨ c'd ∨ xyb'd' or ab ∨ xyc'd ∨ b'd' or xyab ∨ c'd ∨ b'd', where x and y can be one of
a, b, c, d, or their complements. For example, in the case ab ∨ c'd ∨ xyb'd' where
xyb'd' ≠ b'd', then when η = 0, we have ab ∨ c'd ∨ xyb'd' = ((xy)'b'd')', and it is easy to
pick x, y, b, d so that (xy)'b'd' = 1. The other details are left to the reader.

Consider now the condition (**). For (**) to hold, f = (α|α ∨ η), where

\[
\alpha(a, b, c, d) = \bigvee_{(i,j) \in K} w_i(a|b)\,w_j(c|d),
\]

with

    K ⊆ {(0,0), (u,0), (1,0), (1,1), (1,u), (u,u)}.

We are going to specify K so that (**) holds. We need to pick α so that, when η = 0,
α > 0 is equivalent to b or d being > 0. Now b or d > 0 if and only if b ∨ d > 0. Using the
decomposition of b ∨ d in terms of the w_i(a|b)w_j(c|d)'s, we see that b ∨ d > 0 if and only
if w_i(a|b)w_j(c|d) > 0 for some (i, j) ≠ (u, u). But when (c|d) ≤ (a|b), that is, when
η = 0, we have w_i(a|b)w_j(c|d) = 0 for all (i, j) ∈ {(0,1), (u,1), (0,u)}. Thus
α(a, b, c, d) > 0 whenever η = 0 and b or d ≠ 0 only for

    K ⊇ {(0,0), (u,0), (1,0), (1,1), (1,u)}.

But the upper bound of K is {(0,0), (u,0), (1,0), (1,1), (1,u), (u,u)}, and it leads to f_1,
which does not satisfy (**). Hence K must be {(0,0), (u,0), (1,0), (1,1), (1,u)}, which
yields f_2.
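
The two uniqueness statements can be explored by enumerating all 64 subsets K and testing (*) and (**) directly. The sketch below is a finite experiment under stated assumptions: R is the ring of subsets of a two-point set (bitmasks), the value u is encoded as the string "u", membership of (α|α ∨ η) in U is tested as "of the form (x|x) with x ≠ 0", and all names are hypothetical.

```python
from itertools import combinations, product

N = 2                       # toy sample space size (an assumption)
FULL = (1 << N) - 1
R = range(FULL + 1)


def neg(x):
    return FULL ^ x


def w(i, a, b):
    """w_1(a|b) = ab, w_0(a|b) = a'b, w_u(a|b) = b'."""
    return {1: a & b, 0: neg(a) & b, "u": neg(b)}[i]


J = [(0, 1), ("u", 1), (0, "u")]                               # atoms of eta
K_MAX = [(0, 0), ("u", 0), (1, 0), (1, 1), (1, "u"), ("u", "u")]


def union(pairs, a, b, c, d):
    """Join of w_i(a|b) w_j(c|d) over the index pairs (i, j) in `pairs`."""
    out = 0
    for i, j in pairs:
        out |= w(i, a, b) & w(j, c, d)
    return out


def in_U(alpha, eta):
    """(alpha | alpha v eta) is unity-type iff it is (x|x) with x != 0."""
    beta = alpha | eta
    return beta != 0 and (alpha & beta) == beta


quads = list(product(R, repeat=4))


def star(K):
    """Condition (*): f in U iff eta = 0, for all a, b, c, d."""
    return all(
        in_U(union(K, *q), union(J, *q)) == (union(J, *q) == 0)
        for q in quads
    )


def star2(K):
    """Condition (**): f in U iff eta = 0 and (b or d is nonzero)."""
    return all(
        in_U(union(K, *q), union(J, *q))
        == (union(J, *q) == 0 and (q[1] | q[3]) != 0)
        for q in quads
    )


subsets = [list(K) for r in range(7) for K in combinations(K_MAX, r)]
star_solutions = [K for K in subsets if star(K)]
star2_solutions = [K for K in subsets if star2(K)]
print(star_solutions)
print(star2_solutions)
```

On this toy algebra the search is expected to return the full six-element set for (*), which gives f_1, and the five-element set without (u,u) for (**), which gives f_2.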

In classical two-valued logic, the equivalence relation ↔ defined by

    a ↔ b = (a → b) ∧ (b → a)

is characterized as the only binary Boolean operation f such that

    f(a, b) = 1 if and only if a = b.

Using the definition a → b = a' ∨ b, this is routine to check. The counterpart in
conditional logic is expressed in the following theorem.

Theorem 3. Let f : (R|R)² → R|R be a conditional Boolean polynomial in two variables.
The following are equivalent.

(i) For any a, b, c, d ∈ R with b, d ≠ 0,

    f((a|b), (c|d)) ∈ U if and only if (a|b) = (c|d),

(ii) f is of the form (α|α ∨ ξ), where

    ξ(a, b, c, d) = (ab + cd) ∨ (b + d),

α is Boolean, α ≤ ξ', and α ≠ 0 when ξ = 0.

Proof. First, (a|b) = (c|d) if and only if ab = cd and b = d. This is the same as ab +
cd = b + d = 0, or (ab + cd) ∨ (b + d) = 0. Let

    ξ = ξ(a, b, c, d) = (ab + cd) ∨ (b + d).

Then

\[
\begin{aligned}
\xi &= abc'd \vee abd' \vee a'bcd \vee b'cd \vee a'bd' \vee b'c'd \\
&= \bigvee_{(i,j) \in J} w_i(a|b)\,w_j(c|d),
\end{aligned}
\]

where

    J = {(u,0), (1,0), (u,1), (1,u), (0,1), (0,u)}.

Now, that (ii) implies (i) is obvious, and (i) implies (ii) since f = (α|α ∨ γ) can satisfy
(i) only if γ = ξ. □

The specific form of α is

\[
\alpha = \bigvee_{(i,j) \in I} w_i(a|b)\,w_j(c|d),
\]

where I ⊆ {(0,0), (1,1), (u,u)}. Again, not all subsets I lead to an α satisfying (ii).
Two interesting candidates are

    I = {(0,0), (1,1), (u,u)}

and

    I = {(1,1), (0,0)}.

For the first,

    α = a'bc'd ∨ abcd ∨ b'd'.

Here, β = α ∨ ξ = ξ' ∨ ξ = 1, so that

    f_3((a|b), (c|d)) = (abcd ∨ a'bc'd ∨ b'd' | 1).

When ξ = 0, α = 1, and hence α ≠ 0.

For I = {(1,1), (0,0)}, α = abcd ∨ a'bc'd. When ξ = 0,

    ξ' = 1 = abcd ∨ a'bc'd ∨ b'd'.

But for b ≠ 0, d ≠ 0, we have b'd' < 1, so that abcd ∨ a'bc'd ≠ 0 for all a, c ∈ R.
Here β = α ∨ ξ = b ∨ d, and

    f_4((a|b), (c|d)) = (abcd ∨ a'bc'd | b ∨ d).

The conditional polynomial f_3 corresponds to the consequent of Lukasiewicz's
three-valued logic equivalence, while f_4 is the syntax of Sobocinski's three-valued logic
equivalence. See Chapter 3 for more details.

It turns out that f_3 and f_4 are the only candidates. Indeed, for other I ⊆ {(0,0),
(1,1), (u,u)}, one can find a, b, c, d, with b ≠ 0 ≠ d, such that α(a, b, c, d) = 0 when
ξ(a, b, c, d) = 0. For example, if I = {(1,1), (u,u)}, then α = abcd ∨ b'd'. Taking a = c
= 0 and b = d = 1, we get ξ(0, 1, 0, 1) = 0, and α(0, 1, 0, 1) = 0.

As a final note, f_2 and f_4 satisfy

    f((a|b), (a|b)) = (b|b).
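
Condition (i) of Theorem 3 can be tested exhaustively for f_3 and f_4 on a small algebra, together with the final identity for f_4. The sketch below assumes R is the ring of subsets of a two-point set (bitmasks, with ring addition as XOR) and stores a conditional as the pair (numerator, antecedent). The function names are hypothetical.

```python
from itertools import product

N = 2                       # toy sample space size (an assumption)
FULL = (1 << N) - 1
R = range(FULL + 1)


def neg(x):
    return FULL ^ x


def xi(a, b, c, d):
    """xi = (ab + cd) v (b + d), with + the symmetric difference."""
    return ((a & b) ^ (c & d)) | (b ^ d)


def f3(a, b, c, d):
    alpha = (a & b & c & d) | (neg(a) & b & neg(c) & d) | (neg(b) & neg(d))
    return (alpha, FULL)                    # (alpha | 1)


def f4(a, b, c, d):
    alpha = (a & b & c & d) | (neg(a) & b & neg(c) & d)
    return (alpha, b | d)                   # (alpha | b v d)


def in_U(pair):
    """Unity-type test: the pair is (x|x) with x != 0."""
    x, y = pair
    return y != 0 and (x & y) == y


def equal(a, b, c, d):
    """(a|b) = (c|d) iff ab = cd and b = d."""
    return (a & b) == (c & d) and b == d


quads = [q for q in product(R, repeat=4) if q[1] != 0 and q[3] != 0]
ok3 = all(in_U(f3(*q)) == equal(*q) for q in quads)
ok4 = all(in_U(f4(*q)) == equal(*q) for q in quads)
xi_ok = all((xi(*q) == 0) == equal(*q) for q in product(R, repeat=4))
diag = all(f4(a, b, a, b) == (b, b) for a in R for b in R)
print(ok3, ok4, xi_ok, diag)
```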


6.3  Truth  conditional  semantics 

This section consists of extending the basics of classical two-valued logic (C₂) and
Probability Logic (PL) to Conditional Logic (CL) and Conditional Probability Logic
(CPL). By Conditional Logic, we mean Lukasiewicz's three-valued logic on the
conditional space R|R, where R is a Boolean ring, or equivalently, R|R equipped with the
logical operations developed in Chapters 3 and 4. By Conditional Probability Logic, we
mean a multi-valued logic with base space R|R on which truth-values are conditional
probabilities. The base space of CL is a (special) Stone algebra R|R (see Chapter 4).
Similarly, the truth-space of CL is the Stone algebra {0, u, 1}, with 0 < u < 1, with the
following operators. (See Section 3.4 for the appearance of {0, u, 1} as the truth space
for R|R.) We use the same notation ', ∧, and ∨ as on R|R. In view of Lukasiewicz's
truth tables, for i, j ∈ {0, u, 1}, we have

    i ∧ j = min{i, j},  i ∨ j = max{i, j},

    0' = 1,  1' = 0,  u' = u.

The pseudo-complementation on {0, u, 1} is 0* = 1, u* = 0, 1* = 0, which does satisfy
Stone's identity i* ∨ i** = 1, ∀i ∈ {0, u, 1}.
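
The operators on {0, u, 1} are small enough to check mechanically. In the sketch below, u is encoded as the fraction 1/2, which is purely an encoding choice that respects 0 < u < 1; the function names are hypothetical.

```python
from fractions import Fraction

U = Fraction(1, 2)          # assumed numeric stand-in for u, with 0 < u < 1
VALUES = (0, U, 1)


def meet(i, j):
    return min(i, j)


def join(i, j):
    return max(i, j)


def neg(i):
    """Lukasiewicz negation: 0' = 1, 1' = 0, u' = u."""
    return 1 - i


def star(i):
    """Pseudo-complement: 0* = 1, u* = 0, 1* = 0."""
    return 1 if i == 0 else 0


# Stone's identity i* v i** = 1 on all three values.
stone = all(join(star(i), star(star(i))) == 1 for i in VALUES)

# i* is the largest j with i ^ j = 0, as a pseudo-complement should be.
pseudo = all(
    star(i) == max(j for j in VALUES if meet(i, j) == 0) for i in VALUES
)
print(stone, pseudo, neg(U) == U)
```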

First, we formulate the concept of a model in CL.

Definition 1. A model in CL is a homomorphism from R|R to {0, u, 1}, that is, a map
preserving the operators ', ∧, and ∨.

It turns out that models in CL can be built from those in C₂. Specifically,

Theorem 1.

(i) If ω ∈ Ω is a maximal filter of R, then the map h_ω : R|R → {0, u, 1} defined
by

\[
h_\omega(a|b) = \begin{cases} 1 & \text{if } ab \in \omega, \\ 0 & \text{if } a'b \in \omega, \\ u & \text{if } b' \in \omega \end{cases}
\]

is a homomorphism.

(ii) If h : R|R → {0, u, 1} is a homomorphism, then there is an ω ∈ Ω such that
h = h_ω.

Proof. Note that since {ab, a'b, b'} forms a partition of 1, and ω is a maximal
filter of R, h_ω is well-defined. For the proof of (i) we have

\[
h_\omega((a|b)') = h_\omega(a'|b) = \begin{cases} 1 & \text{if } a'b \in \omega, \\ 0 & \text{if } ab \in \omega, \\ u & \text{if } b' \in \omega. \end{cases}
\]

In view of Lukasiewicz's negation on {0, u, 1} (see Section 3.5), we get

    h_ω((a|b)') = (h_ω(a|b))'.

By DeMorgan's laws on R|R (Theorem 3, Section 4.1) and the fact that ' is involutive,
it remains only to show that for (a|b), (c|d) ∈ R|R,

    (*)  h_ω[(a|b)(c|d)] = h_ω(a|b) h_ω(c|d).

Now

    (a|b)(c|d) = (ac|a'b ∨ c'd ∨ bd).

Since

    ac(a'b ∨ c'd ∨ bd) = abcd,

and abcd ∈ ω if and only if ab ∈ ω and cd ∈ ω, (*) is true for the value 1.
For the value 0, we have

    (ac)'(a'b ∨ c'd ∨ bd) = a'b ∨ c'd ∈ ω

if and only if either a'b or c'd (or both) ∈ ω. Thus, in view of Lukasiewicz's
conjunction on {0, u, 1} (see Section 3.5), (*) is true.

Finally, the following are equivalent:

(1) h_ω[(a|b)(c|d)] = u,

(2) a'b ∨ c'd ∨ bd ∉ ω,

(3) a'b, c'd, bd ∉ ω,

(4) a ∨ b', c ∨ d', b' ∨ d' ∈ ω,

(5) (a or b' ∈ ω) and (c or d' ∈ ω).

The case "only a and c are in ω" is excluded, since then abcd ∈ ω,
contradicting the condition h_ω[(a|b)(c|d)] = u. All remaining cases correspond to
h_ω(a|b) h_ω(c|d) = u. For example, if only a and d' are in ω, then ab ∈ ω, d' ∈ ω,
and hence h_ω(a|b) h_ω(c|d) = 1·u = u.

To prove (ii), let h : R|R → {0, u, 1} be a homomorphism. Let g be the
restriction of h to R, viewing R as R|1. This restriction g can only take values in
{0, 1}. Indeed, if there is an a ∈ R such that g(a) = u, then since g is obviously a
homomorphism from R to {0, u, 1}, we have

    0 = g(0) = g(aa') = g(a)g(a') = [g(a)][g(a)]' = uu' = uu = u,

which is impossible. Thus g is a Boolean homomorphism between R and {0, 1}, and
hence is the indicator function I_ω of some maximal filter ω of R.

It remains to show that h = h_ω. Observe that

    (a|b) = (ab|1) ∨ [(b'|1)(0|0)]

and

    (0|0)' = (1|0) = (0|0),

which implies that h(0|0) = h((0|0)') = [h(0|0)]' = u, since u is the unique element in
{0, u, 1} such that u' = u. Thus

    h(a|b) = I_ω(ab) ∨ I_ω(b')·u.

From this, since I_ω(b')·u ≤ u, h(a|b) = 1 if and only if I_ω(ab) = 1, if and only if
h_ω(a|b) = 1. Next, h(a|b) = 0 if and only if I_ω(ab) = I_ω(b') = 0, if and only if
I_ω(a'b) = 1, if and only if h_ω(a|b) = 0. Finally, h(a|b) = u if and only if I_ω(ab) =
0 and I_ω(b') = 1, if and only if h_ω(a|b) = u. □
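
Part (i) of the theorem can be checked by brute force in the finite case. The sketch below assumes R is the ring of subsets of a three-point set (bitmasks), so that each maximal filter ω consists of the events containing one fixed point. It encodes u as 1/2 and uses the product (a|b)(c|d) = (ac | a'b ∨ c'd ∨ bd) together with its De Morgan dual for ∨. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 3                       # toy sample space size (an assumption)
FULL = (1 << N) - 1
R = range(FULL + 1)
U = Fraction(1, 2)          # assumed numeric stand-in for u


def neg(x):
    return FULL ^ x


def h(point, a, b):
    """h_omega(a|b), omega being the filter of events containing `point`."""
    def in_omega(x):
        return ((x >> point) & 1) == 1
    if in_omega(a & b):
        return 1
    if in_omega(neg(a) & b):
        return 0
    return U                # here b' is in omega


def times(a, b, c, d):
    """(a|b)(c|d) = (ac | a'b v c'd v bd), returned as (numerator, antecedent)."""
    return (a & c, (neg(a) & b) | (neg(c) & d) | (b & d))


def join(a, b, c, d):
    """(a|b) v (c|d) = (a v c | ab v cd v bd)."""
    return (a | c, (a & b) | (c & d) | (b & d))


def is_homomorphism(point):
    for a, b, c, d in product(R, repeat=4):
        x, y = h(point, a, b), h(point, c, d)
        if h(point, neg(a), b) != 1 - x:                 # negation
            return False
        if h(point, *times(a, b, c, d)) != min(x, y):    # conjunction
            return False
        if h(point, *join(a, b, c, d)) != max(x, y):     # disjunction
            return False
    return True


all_ok = all(is_homomorphism(p) for p in range(N))
print(all_ok)
```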

In view of the Theorem above, models in CL are precisely the maps h_ω, ω ∈ Ω.

We investigate now two possible counterparts of maximal filters in the case of Stone
algebras. First, consider h_ω^{-1}(1). Since (0|0) = (0|0)' does not belong to any h_ω^{-1}(1),
the characterization of maximality of (Boolean) filters does not hold for h_ω^{-1}(1).
However, other properties of ω remain valid for h_ω^{-1}(1). In particular, each set h_ω^{-1}(1)
is a filter in the lattice (R|R, ∧, ∨). That is, if (a|b), (c|d) ∈ h_ω^{-1}(1), then

    (a|b) ∧ (c|d) ∈ h_ω^{-1}(1),

and if (a|b) ∈ h_ω^{-1}(1) and (c|d) ∈ R|R, then

    (a|b) ∨ (c|d) ∈ h_ω^{-1}(1).

In fact, since h_ω : R|R → {0, u, 1} is a homomorphism, each h_ω^{-1}(1) satisfies the
following stronger conditions:

(1) (a|b), (c|d) ∈ h_ω^{-1}(1) if and only if (a|b) ∧ (c|d) ∈ h_ω^{-1}(1),

(2) (a|b) ∨ (c|d) ∈ h_ω^{-1}(1) if and only if (a|b) ∈ h_ω^{-1}(1) or (c|d) ∈ h_ω^{-1}(1), and

(3) (1|1) ∈ h_ω^{-1}(1), (0|1) ∉ h_ω^{-1}(1), (0|0) ∉ h_ω^{-1}(1).

Moreover, the class of filters of R|R satisfying (1), (2) and (3) is precisely
{h_ω^{-1}(1), ω ∈ Ω}. To see this, let A ⊆ R|R satisfy (1), (2) and (3), and set ω = A ∩ R,
where R is identified with R|1. Then ω is obviously a filter in R. Moreover, for a ∈ R,
either (a|1) or (a'|1) is in A, since otherwise

    (a|1) ∨ (a'|1) = (1|1)

will not be in A, by (2), a contradiction. Thus ω is maximal. It remains to verify that
A = h_ω^{-1}(1). If (a|b) ∈ h_ω^{-1}(1), that is, ab ∈ ω = A ∩ R, then (ab|1) ∈ A. But
(a|b) ∧ (ab|1) = (ab|1), so that (a|b) ∈ A by (1). Conversely, if (a|b) ∈ A, then write
(a|b) = ab ∨ (b'·(0|0)) ∈ A. By (2), we have ab ∈ A or b'·(0|0) ∈ A. But
b'·(0|0) ∈ A holds only if b' ∈ A and (0|0) ∈ A, by (1). However, by (3), (0|0) ∉ A,
thus only ab ∈ A holds, that is, ab ∈ ω, so that h_ω(a|b) = 1. □

Consider now h_ω^{-1}({u, 1}), ω ∈ Ω. Since h_ω : R|R → {0, u, 1} is a
homomorphism, the following facts are easy to derive:

(i) h_ω^{-1}({u, 1}) ∩ R = ω, a maximal filter of R.

(ii) For (a|b) ∈ R|R, (a|b) ∈ h_ω^{-1}({u, 1}) or (a|b)' ∈ h_ω^{-1}({u, 1}), or both.

(iii) If (a|b) ∈ h_ω^{-1}({u, 1}), then for (c|d) ∈ R|R, (a|b) ∨ (c|d) ∈ h_ω^{-1}({u, 1}).

(iv) (a|b) ∧ (c|d) ∈ h_ω^{-1}({u, 1}) if and only if both (a|b) and (c|d) are in
h_ω^{-1}({u, 1}).

(v) (a|b) ∨ (c|d) ∈ h_ω^{-1}({u, 1}) if and only if (a|b) ∈ h_ω^{-1}({u, 1}) or
(c|d) ∈ h_ω^{-1}({u, 1}).

(vi) For b ∈ R, (1|b) ∈ h_ω^{-1}({u, 1}).

(vii) (a|b) ∈ h_ω^{-1}({u, 1}) if and only if b → a = b' ∨ a ∈ ω.

As in the case of h_ω^{-1}(1), the class of filters of R|R satisfying (i)-(vii) above
is precisely {h_ω^{-1}({u, 1}), ω ∈ Ω}.

Remark. Since (0|1) ∉ h_ω^{-1}({u, 1}), the filter h_ω^{-1}({u, 1}) is proper. If A ⊆ R|R is a
filter satisfying (i)-(vii), and h_ω^{-1}({u, 1}) ⊆ A, then h_ω^{-1}({u, 1}) = A, that is,
h_ω^{-1}({u, 1}) is "maximal." Indeed, we have ω ⊆ A ∩ R, which is a maximal filter of R
by (i). But then ω = A ∩ R. From (vii) and the above observation, if (a|b) ∈ A then
(b → a) ∈ ω, implying that (a|b) ∈ h_ω^{-1}({u, 1}).

We specify now basic semantic concepts of CL. Recall again that Ω is the class of
models (maximal filters) of R. In order to define the concept of tautologies in terms of
models of R|R, we need to specify the class of "designated truth values" (Rescher, 1969,
pp. 66-71). Indeed, as in any multi-valued logic, among the truth-values 0, u, 1, we have
to classify (or designate) certain of these values as "truth-like" values (for the concept of
contradictions, the dual concept is "false-like" values or "antidesignated" values). Thus, if
1 is the only designated value, then (a|b) ∈ R|R is a tautology if it is "true" in all models
of R|R, that is, for all ω ∈ Ω, (a|b) ∈ h_ω^{-1}(1). It is clear that (1|1) is the only tautology
in this sense. Indeed, (a|b) ∈ h_ω^{-1}(1) if and only if ab ∈ ω. Thus if 0 ≤ ab < 1, then
(ab)' ≠ 0, so that there is some ω ∈ Ω such that (ab)' ∈ ω and hence ab ∉ ω. Of
course, if ab = 1 then b ≥ ab implies that b = 1, and (a|b) = (ab|b) = (1|1).

If {u, 1} is the set of designated truth-values, then (a|b) is a tautology if for all
ω ∈ Ω, (a|b) ∈ h_ω^{-1}({u, 1}).

Now if 0 ≤ ab < b, then a'b ≠ 0, so that there is some γ ∈ Ω such that a'b ∈ γ, so
that (a|b) ∉ h_γ^{-1}({u, 1}). Thus ab = b ≠ 0, that is, (a|b) = (ab|b) = (b|b) = (1|b), and
hence, the class of {u, 1}-tautologies is {(1|b), b ∈ R\{0}}, which is the class of
unity-type conditionals U investigated in Section 6.2. Note that for ω ∈ Ω,
h_ω(0|0) = u, so that formally (0|0) is also a tautology. To exclude (0|0), one should
require that (a|b) is a {u, 1}-tautology if (a|b) ∈ h_ω^{-1}({u, 1}) for all ω ∈ Ω, and there is at
least one γ ∈ Ω such that h_γ(a|b) = 1.

The concept of entailment relation in CL is formalized as follows. We say that (a|b)
logically entails (c|d), in symbols,

    (a|b) ⊢_CL (c|d),

if for all ω ∈ Ω, (c|d) ∈ h_ω^{-1}(1) whenever (a|b) ∈ h_ω^{-1}(1), and (c|d) ∈ h_ω^{-1}({u, 1})
whenever (a|b) ∈ h_ω^{-1}({u, 1}). Roughly speaking, (a|b) entails (c|d) if the truth-value
of (c|d) is greater than or equal to that of (a|b). More precisely,

Theorem 2. The following are equivalent.

(i) (a|b) ⊢_CL (c|d),

(ii) for all ω ∈ Ω, h_ω(a|b) ≤ h_ω(c|d), and

(iii) (a|b) ≤ (c|d).

Proof. That (i) and (ii) are equivalent is obvious. To get the equivalence of (i) and
(iii), note that h_ω(a|b) = 1 if and only if ab ∈ ω, and h_ω(a|b) ∈ {u, 1} if and only if
b' ∨ a ∈ ω. Condition (i) can thus be rephrased: for all ω ∈ Ω, ab ∈ ω implies cd ∈ ω,
and for all ω ∈ Ω, b' ∨ a ∈ ω implies d' ∨ c ∈ ω. By Lemma 2 of Section 6.1, these
statements are equivalent to ab ≤ cd and b' ∨ a ≤ d' ∨ c, or ab ≤ cd and c'd ≤ a'b, which
means (iii). (See Theorem 1, Section 3.3.) □
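
The three conditions of the theorem can be compared directly on a finite example. The sketch below assumes R is the ring of subsets of a two-point set (bitmasks), takes the maximal filters to be the point filters, and encodes u as 1/2. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 2                       # toy sample space size (an assumption)
FULL = (1 << N) - 1
R = range(FULL + 1)
U = Fraction(1, 2)          # assumed numeric stand-in for u


def neg(x):
    return FULL ^ x


def leq(x, y):
    return (x & neg(y)) == 0


def h(point, a, b):
    """h_omega(a|b) for the point filter at `point`."""
    def bit(x):
        return (x >> point) & 1
    if bit(a & b):
        return 1
    if bit(neg(a) & b):
        return 0
    return U


def entails(a, b, c, d):
    """Condition (i): both {1} and {u, 1} are preserved in every model."""
    for p in range(N):
        x, y = h(p, a, b), h(p, c, d)
        if x == 1 and y != 1:
            return False
        if x != 0 and y == 0:
            return False
    return True


def pointwise(a, b, c, d):
    """Condition (ii): h_omega(a|b) <= h_omega(c|d) for every omega."""
    return all(h(p, a, b) <= h(p, c, d) for p in range(N))


def order(a, b, c, d):
    """Condition (iii): ab <= cd and c'd <= a'b."""
    return leq(a & b, c & d) and leq(neg(c) & d, neg(a) & b)


ok = all(
    entails(*q) == pointwise(*q) == order(*q)
    for q in product(R, repeat=4)
)
print(ok)
```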

As in the case of C₂, the logical entailment relation ⊢ in CL is monotone. This
follows readily from the fact that h_ω is a homomorphism. See, however, Chapter 8.

Now to Conditional Probability Logic (CPL). One of the practical motivations for
considering conditional probabilities lies in the construction of Bayesian (causal) networks
(for example, Lauritzen and Spiegelhalter, 1988). For quantifying rules in intelligent
systems with other uncertainty measures, see, for example, Dubois and Prade, 1990. As in
the case of PL, if P is a probability measure on R, then P(a|b) = r means that a is
"true" in 100r% of the "possible worlds" in which b is "true." A model of CPL is an
extension P̂ : R|R → [0, 1] of a probability measure P on R, defined by

    P̂((a|b)) = P(a|b), for P(b) ≠ 0.

As in the case of probability models, P̂ has the flavor of a "homomorphism-like" map.
See also the previous discussion concerning Adams' ε-semantics. We write P̂ simply as
P.

If $1$ is the only designated truth-value, then $(a|b) \in R|R$ is a CPL-tautology if
$P(a|b) = 1$ for all $P$ on $R$ such that $P(b) \ne 0$. The class of CPL-tautologies is
precisely that of unity-type conditionals $\mathcal{U}$ of Section 6.1. Indeed, if $P(a|b) = 1$ for all
$P$, then $P(ab) = P(b)$, for all $P$, and hence $ab = b$ (Lemma 1 of Section 2.2), so that
$(a|b) = (ab|b) = (b|b) = (1|b)$. The converse is obvious. In the same vein, $P(a|b) = 0$
for all $P$ if and only if $(a|b) = (0|b) \in \mathcal{Z}$, the class of zero-type conditionals in Section
6.1.

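If $\{u, 1\}$ is the set of designated truth-values, then $(a|b)$ is a $\{u, 1\}$-tautology if
$(a|b) \in h_\omega^{-1}(\{u, 1\})$ for $\omega \in \Omega$, and $(a|b) \in h_\omega^{-1}(1)$ for at least one $\omega \in \Omega$. This class of $\{u, 1\}$-tautologies also coincides
with $\mathcal{U}$. Indeed, let $(1|b) \in \mathcal{U}$, $b \ne 0$. We have $(1|b) \in h_\omega^{-1}(\{u, 1\})$, for $\omega \in \Omega$, since
$bb' = 0$. Next, since $b \ne 0$, there is some $\gamma \in \Omega$ such that $b \in \gamma$, that is, $(1|b) \in h_\gamma^{-1}(1)$.

Conversely, let $(a|b)$ be a $\{u, 1\}$-tautology. We have $a'b = 0$, that is, $b \le a$.
Hence $(a|b) = (ab|b) = (b|b)$ with $b \ne 0$, since by hypothesis, there is $\gamma \in \Omega$ such that
$b \in \gamma$.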

The following theorem summarizes basic relations among all the above concepts, the
proof of which follows simply by inspection.

Theorem 3.

(i) The following are equivalent.

α) $(a|b) = (c|d)$,

β) for $\omega \in \Omega$, $h_\omega(a|b) = h_\omega(c|d)$,

γ) for $j = 3$ or $4$, $f_j((a|b), (c|d))$ is a CPL-tautology ($f_3$, $f_4$ of Theorem 3,
Section 6.2),

δ) for $j = 1$ or $2$, $f_j((a|b), (c|d))$ and $f_j((c|d), (a|b))$ are CPL-tautologies
($f_1$, $f_2$ of Theorem 2, Section 6.2).

(ii) The following are equivalent.

α) $P(a|b) = P(c|d)$ for $P$ such that $P(b), P(d) \ne 0$,

β) for $j = 3$ or $4$, $(a|b)$ and $(c|d)$ are CPL-tautologies, or $(a|b)'$ and $(c|d)'$
are CPL-tautologies, or $f_j((a|b), (c|d))$ is a CPL-tautology,

γ) $(a|b) = (c|d)$ or $(a|b), (c|d) \in \mathcal{U}$ or $(a|b), (c|d) \in \mathcal{Z}$.

(iii) The following are equivalent.

α) $(a|b) \le (c|d)$,

β) for $\omega \in \Omega$, $h_\omega(a|b) \le h_\omega(c|d)$,

γ) for $j = 1$ or $2$, $f_j((a|b), (c|d))$ is a CPL-tautology.

(iv) The following are equivalent.

α) $P(a|b) \le P(c|d)$ for $P$ such that $P(b), P(d) \ne 0$,

β) $(a|b) \le (c|d)$ or $(a|b) \in \mathcal{Z}$ or $(c|d) \in \mathcal{U}$,

γ) for $j = 1$ or $2$, $f_j((a|b), (c|d))$ or $(a|b)'$ is a CPL-tautology, or $(c|d)$ is a
CPL-tautology.


In $C_2$, $b \vdash a$ if and only if $b \to a = b' \vee a = 1$. The counterpart of this
equivalence in CL is that $(c|d) \vdash (a|b)$ if and only if $f_1((a|b), (c|d))$ or
$f_2((a|b), (c|d))$ is a (CPL)-tautology (that is, in $\mathcal{U}$). Note that $f_1$ and $f_2$ play the role
of material implication on $R$. The equivalence above follows from Theorem 2 of Section
6.2.

Finally, the logical entailment relation in CPL is defined by

$(c|d) \vdash_{CPL} (a|b)$ if $P(c|d) \le P(a|b)$

for $P$ such that $P(b)$ and $P(d) \ne 0$.

In view of Theorem 3, (iv), $(c|d) \vdash_{CPL} (a|b)$ if and only if

(1) $(c|d) \le (a|b)$, or

(2) $cd = 0$, or

(3) $b \le a$.
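One direction of this characterization is easy to probe numerically. The following sketch, written under our own assumptions (a three-point universe, twenty randomly drawn strictly positive measures, and the helper names `criterion` and `entails_on_samples`), checks only that each of the conditions (1), (2), (3) forces $P(c|d) \le P(a|b)$ on the sampled measures. It does not test the converse, which would require ranging over all $P$, including those that vanish on some points.

```python
from fractions import Fraction
from itertools import combinations, product
import random

ATOMS = (0, 1, 2)  # assumed finite universe; events are its subsets
EVENTS = [frozenset(s) for r in range(len(ATOMS) + 1)
          for s in combinations(ATOMS, r)]

def prob(weights, event):
    """Probability of an event under point weights that sum to one."""
    return sum((weights[w] for w in event), Fraction(0))

def cond(weights, a, b):
    """Conditional probability P(a|b) = P(ab) / P(b), for P(b) != 0."""
    return prob(weights, a & b) / prob(weights, b)

def criterion(a, b, c, d):
    """Conditions (1)-(3): (c|d) <= (a|b), or cd = 0, or b <= a."""
    leq = (c & d) <= (a & b) and (b - a) <= (d - c)
    return leq or not (c & d) or b <= a

random.seed(0)
measures = []
for _ in range(20):
    raw = [Fraction(random.randint(1, 9)) for _ in ATOMS]  # strictly positive
    total = sum(raw)
    measures.append([r / total for r in raw])

def entails_on_samples(a, b, c, d):
    """P(c|d) <= P(a|b) for every sampled measure."""
    return all(cond(P, c, d) <= cond(P, a, b) for P in measures)

# With strictly positive weights, P(b) != 0 exactly when b is nonempty.
sufficient = all(entails_on_samples(a, b, c, d)
                 for a, b, c, d in product(EVENTS, repeat=4)
                 if b and d and criterion(a, b, c, d))
```

Exact rational arithmetic is used so that the comparison of conditional probabilities is not affected by rounding.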

We summarize the four logical systems discussed in this chapter.


Classical two-valued Logic ($C_2$)

Alphabet/Base space: $R$
Logical operators and relations: $(\cdot)'$, $\wedge$, $\vee$, ...
Equational axioms: Axioms of Boolean ring $R$
Truth space: $\{0, 1\}$, designated value: $1$
Models: $\Omega$ = {maximal filters of $R$}
Tautologies: $1$

Conditional Logic (CL)

Alphabet/Base space: $R|R$
Logical operators and relations: See Chapters 3 and 4 (Lukasiewicz's three-valued
logic)
Equational axioms: Axioms of abstract conditional space (Chapter 4)
Truth space: $\{0, u, 1\}$, designated values: $\{u, 1\}$
Models: {homomorphisms $h_\omega$, $\omega \in \Omega$}
Tautologies: $\mathcal{U} = \{(1|b),\ b \in R\}$

Probability Logic (PL)

Alphabet/Base space: $R$
Logical operators and relations: same as $C_2$
Equational axioms: same as $C_2$
Truth space: $[0, 1]$, designated value: $1$
Models: {$P : R \to [0, 1]$, probability measures}
Tautologies: $1$

Conditional Probability Logic (CPL)

Alphabet/Base space: $R|R$
Logical operators and relations: same as CL
Equational axioms: same as CL
Truth space: $[0, 1]$, designated value: $1$
Models: {extended $P$ from $R$ to $R|R$}
Tautologies: $\mathcal{U} = \{(1|b) : b \in R \setminus \{0\}\}$

6.4 Additional properties of CPL

Although the concrete base space $R|R$ is sufficient for applications, we present
in this section basic properties of CPL in a more general setting. Recall from
Chapter 4 that the abstraction of $R|R$ is an abstract conditional space $L$ in which its
skeletal set $L^*$ plays the role of $R$, and $L$ is isomorphic to the concrete realization
$(L^*|L^*)$.

As usual, the logical structure of CPL($L$) is given by sets of rules (Rul($L$)),
deducts (Ded($L$)), models (Mod($L$)), semantic evaluations ($P(L)$ = all probability
measures on $L$), tautologies (the class $\mathcal{U}(L) = \{(b|b),\ b \in L^* \setminus \{0\}\}$), and axioms
(Ax($L$) = axioms of $L$ as an algebraic structure, together with a set of logical connectives
on $(L^*|L^*)$). Note that, as a base space, $(L^*|L^*)$ is a three-valued logical system.
When this set of connectives is the one developed in Chapters 3 and 4, the
corresponding three-valued logic is Lukasiewicz's. Different choices of connectives lead to
different three-valued logical systems.

In order to investigate deducts and tautologies, it is necessary to be able to identify
certain deducts as tautologies and conversely. Specifically, first deducts here are of the
form $h(\alpha) = g(\alpha)$ or $h(\alpha) \le g(\alpha)$, where $\alpha$ is any collection of conditional event
variables and $h$, $g$ are combinations of logical operators of $L$. In view of the remarks
following Theorems 2 and 3 of Section 6.3, we can make the following identifications:

$(h(\alpha) = g(\alpha)) \mapsto f_i(h(\alpha), g(\alpha))$, $i = 3$ or $4$,

$(h(\alpha) \le g(\alpha)) \mapsto f_i(h(\alpha), g(\alpha))$, $i = 1$ or $2$.

In the case of $R|R$, we have

$f_2((a|b), (c|d)) = e \cdot ((c|d) \Rightarrow (a|b))$,

where $e = ab \vee c'd \vee bd \vee b'd'$ and $\Rightarrow$ is the extended material implication on $R|R$,
that is, $(c|d) \Rightarrow (a|b) = (c|d)' \vee (a|b)$. Note that $f_2$ is Sobocinski's material implication.

Using the notation of Chapter 4, it can be checked that the same situation holds in
the general case. Specifically, on $L$, we have

$f_2(\beta, \alpha) = e \cdot (\beta \Rightarrow \alpha)$

where  here 


e  =  3*  v  a'*  V  ((a- a')*  «  0- P')*) . 

Thus $f_2$ is definable in terms of the primitive operators of $L$. We are now ready to
prove the following.

Theorem 1. CPL is sound and complete.

Proof. Using the above identifications, any deduct of $L$ (in the form of equality or
inequality) is a single conditional event. By Theorem 1 of Section 6.3, its identification is
a tautology if and only if its corresponding deduct represents a true relation (equality or
inequality), which is obvious here.

For completeness, first note that, for $\alpha \in L$, $\alpha = f_2(\alpha, \alpha)$. In particular, if
$\alpha \in \mathcal{U}(L)$, then $f_2(\alpha, \alpha)$ is a tautology. Using the identification $(\alpha \le \alpha) \mapsto \alpha$ (as a
deduct), $\alpha$ is itself a deduct. □

Remarks

1. Suppose that instead of identifying equational axioms and resulting deduct forms
as above, one replaces formally all axioms by the corresponding single conditional event
forms depending on the $f_i$'s. Thus to completely axiomatize all relevant expressions,
avoiding the introduction of external equality, single conditional event forms must be
introduced as axioms characterizing $f_2$, $f_4$, in part. Therefore, one can ask whether it is
true that the added axioms in combination with new Rul($L$) would yield Ded($L$) as
interpreted in the above identification from the equational axiom approach. Here, the
added axioms are


(for all $\alpha, \beta \in L$) $(f_4(f_2(\beta, \alpha),\ e \cdot (\beta \Rightarrow \alpha)))$,

(for all $\alpha, \beta \in L$) $(f_4(f_4(\alpha, \beta),\ f_2(\alpha, \beta) \cdot f_2(\beta, \alpha)))$

and new Rul($L$) is given by using $f_2$, $f_4$ analogously as the derived inequality (partial
ordering) $\le$ and equality $=$:

For all $\alpha, \beta, \gamma \in L$,

If $f_j(\alpha, \beta)$, $f_j(\beta, \gamma)$ are deducts, then so is $f_j(\alpha, \gamma)$, $j = 2$ or $4$;
$f_j(\alpha, \alpha)$ is always a deduct, $j = 2$ or $4$ (thus can also be an axiom).

If $f_4(\alpha, \beta)$ is a deduct, then so is $f_4(\beta, \alpha)$.

In a related vein, Rescher (1969, p. 66-67) discusses changing Lukasiewicz's logic
from a logic with one designated truth value to one with the designated truth values $\{u, 1\}$, making a significant enlargement of the
class of possible tautologies for the logic. Rescher states that the axiomatization of this
new logic is an open issue.

In view of our results in Chapter 3 and Section 6.2, together with the identifications in $L$,
and with Lukasiewicz's logic augmented with Slupecki's T-operator (Rescher, 1969, p. 163), and $L$
correspondingly augmented, etc., necessarily with two designated truth values, Theorem 1 in
this section seems to point to the possible axiomatization of the logic Rescher considers
via the structure of $L$. But more work is needed on this.

2. Abstract conditional spaces appear to be related to "implicative" algebras in
general (for example, Rasiowa, 1974). From an examination of the axioms describing
them, the more specialized pseudo-boolean or quasi-pseudo-boolean algebras may also be
related. These relations also need to be explored to determine any mutual benefit of results.
3. By comparison of truth values of various conditional operators (Chapter 3) with
Sobocinski's truth tables given in Rescher (1969, p. 70), it follows that (see also Dubois
and Prade, 1989) Sobocinski's logic with designated values $\{u, 1\}$ coincides with the
choice of negation, conjunction and disjunction that Schay, Adams, and Calabrese have
independently made. Unlike Lukasiewicz's implication operator, Sobocinski's implication
is a material implication formed out of negation and (Schay's) disjunction,
that is, $j \to i$ has the truth table obtained by applying the disjunction table to the negation of $j$ and to $i$, for $i, j \in \{0, u, 1\}$. See also
Sobocinski's original work, where it is shown, as an alternative to
Wajsberg's well-known full axiomatization of Lukasiewicz's three-valued logic, or the associated expanded
axiomatization for Slupecki's extension of that logic via his T operator, corresponding to the
special element $u_0$ (or $(0|0)$ in the concrete case of $R|R$), see Rescher (1969, p. 155),
that Sobocinski's system can be fully axiomatized. Furthermore, as a justification for
Sobocinski's approach, by an extension analogous to Slupecki's, the resulting logic is seen
to be also truth functionally operator-complete, being the only other known such system.
(See also Rose (1953), Schalz (1959)). Specifically, note that Slupecki's extension of Lukasiewicz's logic
being truth functionally operator-complete translates, via Theorem 2 of Section 3.4, into
the fact that $(R|R, \vee, \Rightarrow_{\mathrm{L}}, (0|0))$, where $\Rightarrow_{\mathrm{L}}$ denotes Lukasiewicz's implication, is a truth functionally operator-complete system
relative to all possible extended Boolean conditional operators. Indeed, going back to Lukasiewicz's logic,
max ("or"), min ("and"), and $1 - \cdot$ ("not") can all be shown to be compounds only
of $\Rightarrow_{\mathrm{L}}$, so that $(\Rightarrow_{\mathrm{L}}, u)$ is sufficient to span operationally all three-valued truth-functional
operators; hence, by Theorem 2 of Section 3.4 again, the corresponding conditional
operators must likewise span all possible extended Boolean conditional operators.
Similarly, the enlarged Schay-Adams-Calabrese system, corresponding to the enlarged
Sobocinski logic, is truth functionally operator-complete.

Thus, via Theorem 2 of Section 3.4, one can now justify the Schay-Adams-Calabrese
approach to conditional event algebra as being equivalent to Sobocinski's logic. It is, however,
as noted earlier in Section 3.5, quite distinct from Lukasiewicz's logic, the monotone
bound violations for conjunction and disjunction notwithstanding. See, however, Chapter
8, Section 8.2.

4. Using the technique of Theorem 2, Section 3.4, we obtain the following
three-valued truth tables for corresponding conditional operators:

(i) Recall from Chapter 4 that $(R|R)$ is a relatively pseudo-complemented lattice
with relative pseudo-complementation $\to$ given by:

$(c|d) \to (a|b) = (ab \vee c'd \vee b'd' \mid b \vee c'd \vee b'd')$.

Its truth table is

$\psi(i, j) = 1$ for $(i, j) \in \{(0, 0), (0, u), (0, 1), (u, u), (u, 1), (1, 1)\}$,
$\psi(i, j) = u$ for $(i, j) = (1, u)$,
$\psi(i, j) = 0$ for $(i, j) \in \{(u, 0), (1, 0)\}$.

(ii) In particular, the pseudo-complementation operator of $(R|R)$ is:

$(a|b)^* = (a'b \mid 1)$

with truth table

$\psi(0) = 1$, $\psi(u) = 0 = \psi(1)$,

which is that of negation in Heyting's three-valued logic (as mentioned in Section 3.5).

(iii) The material implication on $(R|R)$, $(c|d) \Rightarrow (a|b) = (c|d)' \vee (a|b)$, has truth
table given by

$\psi(i, j) = 1$ for $(i, j) \in \{(0, 0), (0, u), (0, 1), (u, 1), (1, 1)\}$,
$\psi(i, j) = u$ for $(i, j) \in \{(u, 0), (u, u), (1, u)\}$,
$\psi(i, j) = 0$ for $(i, j) = (1, 0)$.

(iv) Slupecki's T-operator (Rescher, 1969, p. 163) has the truth table
corresponding to the constant function $u$, that is, $\psi(i) = u$, for $i \in \{0, u, 1\}$.
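The tables in (i) and (ii) can be recovered mechanically from the two formulas displayed there. The sketch below is an illustration under our own conventions: a conditional event is evaluated at a single point from the membership of that point in its numerator and denominator, the first table index $i$ is read as the value of $(c|d)$ and the second index $j$ as the value of $(a|b)$, and the function names are hypothetical.

```python
from itertools import product

U = "u"  # the third truth value

def value(in_numerator, in_denominator):
    """Truth value at a point of a conditional event (numerator | denominator)."""
    if not in_denominator:
        return U
    return 1 if in_numerator else 0

def rel_pseudo_complement(in_a, in_b, in_c, in_d):
    """Memberships for (c|d) -> (a|b) = (ab v c'd v b'd' | b v c'd v b'd')."""
    extra = (not in_c and in_d) or (not in_b and not in_d)
    return (in_a and in_b) or extra, in_b or extra

# Table for (i), indexed by (value of (c|d), value of (a|b)).
heyting = {}
for in_a, in_b, in_c, in_d in product((False, True), repeat=4):
    key = (value(in_c, in_d), value(in_a, in_b))
    out = value(*rel_pseudo_complement(in_a, in_b, in_c, in_d))
    # Equal truth-value inputs must give equal outputs (truth-functionality).
    assert heyting.setdefault(key, out) == out

# Table for (ii): (a|b)* = (a'b | 1), indexed by the value of (a|b).
negation = {}
for in_a, in_b in product((False, True), repeat=2):
    out = value((not in_a) and in_b, True)
    assert negation.setdefault(value(in_a, in_b), out) == out
```

The dictionaries `heyting` and `negation` can then be compared entry by entry with the tables stated in (i) and (ii).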


CHAPTER 7

FUZZY CONDITIONALS


This chapter is devoted to the extension of the measure-free conditioning concept to
the fuzzy case. Motivated by a random set connection and by the concept of generalized
indicator function of conditional events, a form of membership functions for fuzzy
conditionals is proposed. It turns out that fuzzy conditionals are interval-valued fuzzy sets.
Syntax considerations, as well as probability qualification of fuzzy conditionals, are
investigated. Prior to a formal development of fuzzy conditionals, basic aspects of
fuzziness and fuzzy logics are reviewed.

7.1  Generalities  on  fuzziness 

The  reader  is  referred  to  Klir  and  Folger  (1988)  for  an  introduction  to  the  theory  of 
fuzzy  sets,  and  to  Zadeh  (1988)  for  an  excellent  exposition  of  fuzzy  logic  and  its 
applications. 

Human  communication  is  based  on  natural  language.  Natural  language  contains  fuzzy 
concepts such as "high," "almost," "likely," "intelligence," etc. From a human viewpoint,
fuzziness  is  well-understood,  and  can  be  taken  as  a  primitive  notion.  The  uncertainty  in 
fuzziness  is  much  more  complex  than  that  in  randomness.  Indeed,  imprecision, 
subjectivity,  and  context  dependency  surround  each  fuzzy  label  in  natural  language.  The 
imprecision  and  the  context  dependency  of  the  above  examples  of  fuzzy  labels  are  clear. 
By  subjectivity,  we  mean  that  individuals  might  "understand"  a  fuzzy  label  in  different 
ways.  In  other  words,  fuzzy  concepts  are  not  universal  (or  objective),  as  opposed  to,  say, 
mathematical  concepts.  This  is  perhaps  the  main  source  of  difficulty  in  trying  to  formulate 
a  semantic  (meaning)  information  theory.  See  also  MacLennan  (1988). 

Fortunately, there exists such a "thing" as common sense knowledge which allows us
to approximate fuzzy concepts in a reasonable fashion. Consider, for example, the
information "the temperature is high." A little reflection will reveal that, underlying this
statement, there are: a universe of discourse $X$, namely the range of the (variable)
temperature; the variable "temperature" $\xi$ itself; and the fuzzy predicate $a$ = "high."
Thus, the above information is translated into "$\xi$ is $a$." For this translation to be part of a
knowledge representation language, we need to model $a$ more concretely. With respect
to $X$, $a$ is "inside" $X$. The standard approach to translate this vague idea into
mathematics is to regard $a$ as a sort of subset of $X$. Specifically, the imprecision in the
word "high" forces us to consider $a$ as a generalized subset of $X$, in the sense that
membership in $a$ ranges over the unit interval $[0, 1]$ rather than just $\{0, 1\}$ as in the
case of ordinary subsets of $X$. Generalizing the concept of indicator functions of ordinary
sets, a semantic modeling of the fuzzy concept $a$ is given by a membership function
$\mu_a : X \to [0, 1]$, where, for each $x \in X$, $\mu_a(x)$ is to be interpreted as the degree to which
$x$ is compatible with the meaning of $a$. Also, $\mu_a(x)$ can be interpreted as the truth value
of the proposition "$x$ is a member of $a$." Defined this way, $a$ is referred to as a fuzzy
subset of $X$. (See Section 7.3 for a syntactic approach to fuzzy sets.)

Now, the subjectivity becomes apparent. For the same $a$, different individuals can
assign different maps $\mu_a$. The situation is diametrically opposite to that in random
analysis where each random phenomenon is governed by one and only one distribution
law. When a random law is unknown, one can try to gather relevant statistical data to
estimate it or to test hypotheses about it. This is possible since the law in question is unique.

From  the  above,  we  see  that,  to  each  fuzzy  concept  a  (relative  to  X),  there  are 
different  interpretations  of  its  meaning  representation  at  the  mathematical  level.  This 
flexibility is sometimes beneficial. For example, users of a consulting system can input
their own perceptions of fuzzy concepts.

At the level of application, a common sense membership function $\mu_a$ might be
desirable. This $\mu_a$ can be obtained in various ways. For example, by bias of profession,
a statistician might immediately think about getting $\mu_a$ by collecting data, say, in the
form of questionnaires and by constructing $\mu_a$ based upon a frequency approach.
Perhaps,  this  objective  approach  to  constructing  membership  functions  of  fuzzy  sets  has 
triggered  statements  such  as  "probability  theory  can  handle  fuzziness."  We  emphasize  the 
fact  that,  while  at  the  practical  level,  a  probabilistic  approach  to  modeling  fuzzy  concepts 
is  reasonable  (but  not  the  only  one),  the  primitive  concept  of  fuzziness  is  clearly  different 
from  that  of  randomness.  In  fact,  a  coexistence  of  these  two  notions  is  useful  in  Machine 
Intelligence.  Moreover,  fuzziness  has  the  luxury  of  producing  membership  functions  from 
human  perception,  when  statistical  data  are  not  available.  However,  at  the  membership 
function  level,  there  is  a  specific  relationship  between  fuzziness  and  probability  theory  via 
the  concept  of  random  sets.  This  relationship  shows  that  fuzziness  is  a  weak  specification 
of  random  sets  through  the  one-point  coverage  function.  (See  Section  7.4  for  an 
application  of  this  relationship.) 

It  is  appropriate  here  to  say  a  few  words  about  uncertainty.  Statements  like  "all 
statisticians  agree  on  the  use  of  probability  to  model  uncertainty"  (French,  1990)  should  be 
clarified  a  little  further.  By  uncertainty  in  statistics,  we  mean  a  very  specific  type 
of uncertainty, namely randomness. It is now well accepted that, outside of statistics,
especially  in  AI  models,  there  is  a  clear  distinction  between  uncertainty  and  probability 
(see  for  example,  Bellman,  1978;  Levi,  1973;  Neapolitan,  1990).  More  specifically, 
probability  theory  models  one  type  of  uncertainty,  while  in  general  decision  theory,  other 
types  of  uncertainty  may  surface.  Of  course,  by  analogy  with  randomness,  one  can  try  to 
use  statistical  methodologies  and  techniques  to  model  or  to  approximate  other  types  of 
uncertainty  (see  for  example,  Mosteller  and  Youtz,  1990).  But  the  intrinsic  property  of 
each  type  of  uncertainty  remains  unchanged  (see  the  comments  of  N.  Cliff  following  the 
article  of  Mosteller  and  Youtz,  p.  16-18).  In  our  view,  other  non-probabilistic  approaches 
to  uncertainty  modeling  are  not  alternatives  to  statistical  tools.  Rather  they  address 
different  problems  in  which  the  uncertainty  involved  is  not  statistical  in  nature  (see  for 
example,  Neapolitan,  1990).  This  is  similar  to  the  situation  in  quantum  probability  (for 
example,  Gudder,  1988).  The  concept  of  fuzziness,  as  an  example,  is  best  explained  in  the 
context  of  semantic  processing  of  natural  languages.  (See  again  Neapolitan,  1990;  also 
Levi, 1973). There are various reasons for ad-hoc uncertainty modeling to be attractive to
designers of intelligent machines. This is a healthy sign in view of AI problems. For the
problem  of  admissibility  of  uncertainty  measures  in  expert  systems,  see  Goodman,  Nguyen 
and  Rogers  (1990). 

So  far,  we  have  discussed  the  problem  of  meaning  representation  of  fuzzy  concepts. 
Whatever approaches are taken, we have a collection of membership functions $\mu_a$, $a \in A$,
say,  in  a  knowledge  base  of  some  system.  The  problem  of  interest  is  how  to  combine  them 
in  order  to  extract  information  for  decision  processes.  This  is  basically  the  problem  of 
using  logic  as  a  formal  tool  in  artificial  intelligence  (see  for  example,  Ramsay,  1988). 
More  specifically,  a  formal  logic  will  provide  us  with  a  way  of  constructing  a  meaning 
representation  language  in  which  facts,  rules  and  deduction  (for  inference)  can  be  stated. 
In  this  spirit,  we  are  going  to  look  at  logical  aspects  of  fuzzy  sets. 

7.2  Fuzzy  logics 

Roughly  speaking,  fuzzy  logic  is  a  knowledge  representation  language  in  which  facts 
and  rules  involving  fuzzy  information  can  be  represented  mathematically,  and  in  which 
inference  with  fuzzy  data  can  be  described  logically.  Fuzzy  logic  is  essentially  a  logic  that 
models  the  fuzziness  in  natural  language. 

To  avoid  confusion,  it  is  necessary  to  classify  different  types  of  fuzzy  logics. 
First-order fuzzy logics refer to logics of fuzzy sets in which the truth space is the unit
interval $[0, 1]$. A fuzzy logic is called second-order if its truth space is the space of fuzzy
subsets of $[0, 1]$. In any case, fuzzy logics are multivalued logics.

A standard first-order fuzzy logic is proposed by Zadeh by specifying semantic
operations among fuzzy subsets of $X$ as follows. The class of all fuzzy subsets of $X$ is
the set of maps $\mathcal{F}(X) = \{f : X \to [0, 1]\}$ from $X$ into $[0, 1]$. For $f \in \mathcal{F}(X)$, the negation
$f'$ of $f$ is defined by $f'(x) = 1 - f(x)$. For $f, g \in \mathcal{F}(X)$, the "truth tables" for the
connectives "and," "or" are, respectively,

$(f \wedge g)(x) = f(x) \wedge g(x) = \min(f(x), g(x))$,

and

$(f \vee g)(x) = f(x) \vee g(x) = \max(f(x), g(x))$.

With respect to the truth space $[0, 1]$, these are Lukasiewicz's logical operations. Of
course, this standard first-order fuzzy logic generalizes classical two-valued logic. See Klir
and Folger, 1988, Section 1.6, for details. This approach is semantic in the sense that the
objects under study are membership functions, generalizing indicator functions of ordinary
subsets of $X$, rather than their counterparts of ordinary subsets of $X$. This point will be
made precise in the next section.
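The three operations just defined are simple enough to be written out directly. The following is a minimal sketch, assuming a finite universe and membership functions stored as dictionaries; the universe, the membership degrees, and the names `f_not`, `f_and`, `f_or` are hypothetical choices made for illustration only.

```python
from fractions import Fraction

def f_not(f):
    """Negation: f'(x) = 1 - f(x)."""
    return {x: 1 - m for x, m in f.items()}

def f_and(f, g):
    """Conjunction: (f ^ g)(x) = min(f(x), g(x))."""
    return {x: min(f[x], g[x]) for x in f}

def f_or(f, g):
    """Disjunction: (f v g)(x) = max(f(x), g(x))."""
    return {x: max(f[x], g[x]) for x in f}

X = ("x1", "x2", "x3", "x4")  # hypothetical finite universe
f = dict(zip(X, (Fraction(0), Fraction(1, 4), Fraction(3, 4), Fraction(1))))
g = dict(zip(X, (Fraction(1), Fraction(1, 2), Fraction(1, 2), Fraction(0))))

# De Morgan's law holds for these operations.
de_morgan = f_not(f_and(f, g)) == f_or(f_not(f), f_not(g))

def indicator(subset):
    """Indicator function of an ordinary subset of X."""
    return {x: Fraction(int(x in subset)) for x in X}

# On indicator functions the operations reduce to the classical set operations.
a, b = {"x1", "x2"}, {"x2", "x3"}
classical = (f_and(indicator(a), indicator(b)) == indicator(a & b)
             and f_or(indicator(a), indicator(b)) == indicator(a | b)
             and f_not(indicator(a)) == indicator(set(X) - a))

# Unlike the classical case, f ^ f' need not vanish.
overlap = f_and(f, f_not(f))
```

The last line illustrates one way in which the fuzzy operations differ from the Boolean ones: `overlap` is not identically zero for this `f`.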

When the above logical operations are applied to fuzzy subsets of $[0, 1]$ viewed as
truth values in a second-order fuzzy logic, the resulting logic is referred to in the literature
simply as fuzzy logic. See Zadeh (1988) for additional details. When the truth space is
$[0, 1]$, one can model the basic connectives "not," "and," "or" in various ways, extending,
however, the classical two-valued truth tables of these connectives. For negation (or fuzzy
complement), one can use any negation operator, that is, any function

$N : [0, 1] \to [0, 1]$

such that

(i) $N(0) = 1$, $N(1) = 0$,

(ii) $N$ is continuous,

(iii) $N$ is involutive, that is, $N(N(x)) = x$, $\forall x \in [0, 1]$, and

(iv) $N$ is non-increasing.

See for example, Bonissone and Decker, 1988.
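Conditions (i), (iii) and (iv) lend themselves to a quick check on a finite grid; continuity (ii) cannot be tested this way and is left aside. The sketch below is an illustration only. Besides the standard negation $1 - x$ it uses the Sugeno family $(1 - x)/(1 + \lambda x)$, which is a well-known example not taken from the text, and the function $1 - x^2$ as a non-example; the grid and all function names are our own.

```python
from fractions import Fraction

GRID = [Fraction(k, 20) for k in range(21)]  # sample points of [0, 1]

def standard(x):
    """The negation N(x) = 1 - x used in the text."""
    return 1 - x

def sugeno(x, lam=Fraction(2)):
    """Sugeno-type negation (1 - x) / (1 + lam * x); assumed example, lam > -1."""
    return (1 - x) / (1 + lam * x)

def not_involutive(x):
    """N(x) = 1 - x**2 satisfies (i) and (iv) but fails (iii)."""
    return 1 - x * x

def is_negation_on_grid(N):
    """Check (i), (iii), (iv) on GRID; continuity (ii) is not checked."""
    boundary = N(Fraction(0)) == 1 and N(Fraction(1)) == 0
    involutive = all(N(N(x)) == x for x in GRID)
    nonincreasing = all(N(x) >= N(y) for x, y in zip(GRID, GRID[1:]))
    return boundary and involutive and nonincreasing

results = {N.__name__: is_negation_on_grid(N)
           for N in (standard, sugeno, not_involutive)}
```

Rational arithmetic keeps the involution test exact; with floating point numbers the equality `N(N(x)) == x` would have to be replaced by a tolerance.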

For conjunction, it turns out that the class of t-norms (see Schweizer and Sklar, 1983)
is appropriate to represent conjunction operators, where a t-norm is a binary operation $T$
on $[0, 1]$ such that

(i) $T$ is associative,

(ii) $T$ is commutative,

(iii) $T$ is nondecreasing in each place, that is, if $x \le y$ and $u \le v$ then
$T(x, u) \le T(y, v)$, and

(iv) for $x \in [0, 1]$, $T(x, 1) = x$.

Note that (iii) and (iv) imply that $T(0, x) = 0$, $\forall x \in [0, 1]$, in particular, $T(0, 0) = 0$.
Indeed, $\forall x \in [0, 1]$, $T(0, x) \le T(0, 1) = 0$. A t-norm $T$ is "Boolean-like" in the sense that
its restriction to the vertices of $[0, 1]^2$ is a Boolean function, that is, $T(x, y) = 0$ or $1$
whenever $x$ and $y$ are $0$ or $1$. These functions are used, for example, in neural
networks to model activation functions of the units in the network. See for example,
Williams (1986). Here, values in $[0, 1]$ are viewed as degrees of activation. Note that
the associativity of t-norms is essential in extending these binary operations to n-ary
operations on $[0, 1]$, $n \ge 2$.

Some common examples of t-norms are these:

$T_1(x, y) = \min\{x, y\}$,

$T_2(x, y) = xy$,

$T_3(x, y) = \max\{x + y - 1, 0\}$.
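That $T_1$, $T_2$, $T_3$ satisfy the four t-norm axioms can be checked on a finite grid. The sketch below is a sanity check under our own assumptions (an eleven-point grid, exact rational arithmetic, hypothetical function names); passing it is of course not a proof on all of $[0, 1]$. The arithmetic mean is included as a non-example.

```python
from fractions import Fraction
from itertools import product

GRID = [Fraction(k, 10) for k in range(11)]  # sample points of [0, 1]

def t1(x, y):
    return min(x, y)

def t2(x, y):
    return x * y

def t3(x, y):
    return max(x + y - 1, Fraction(0))

def mean(x, y):
    """Arithmetic mean: commutative and monotone, but not a t-norm."""
    return (x + y) / 2

def is_t_norm_on_grid(T):
    """Check axioms (i)-(iv) of a t-norm at the grid points only."""
    associative = all(T(T(x, y), z) == T(x, T(y, z))
                      for x, y, z in product(GRID, repeat=3))
    commutative = all(T(x, y) == T(y, x) for x, y in product(GRID, repeat=2))
    monotone = all(T(x, u) <= T(y, v)
                   for x, y, u, v in product(GRID, repeat=4)
                   if x <= y and u <= v)
    unit = all(T(x, Fraction(1)) == x for x in GRID)
    return associative and commutative and monotone and unit

checks = {T.__name__: is_t_norm_on_grid(T) for T in (t1, t2, t3, mean)}
```

The mean fails both associativity and the boundary condition (iv), since the mean of $x$ and $1$ equals $x$ only for $x = 1$.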

For disjunction, the class of t-conorms is appropriate. A t-conorm $S$ is a binary
operation on $[0, 1]$ such that

(i) $S$ is associative,

(ii) $S$ is commutative,

(iii) $S$ is nondecreasing in each place, and

(iv) $S(0, x) = x$, and $S(1, x) = 1$, for all $x \in [0, 1]$.

t-norms and t-conorms are dual in the following sense. If $T$ is a t-norm, then

$S(x, y) = 1 - T(1 - x, 1 - y)$

is a t-conorm, and if $S$ is a t-conorm, then

$T(x, y) = 1 - S(1 - x, 1 - y)$ is a t-norm.

If the negation operator $N$ is defined by $N(x) = 1 - x$, then dual t-norms and
t-conorms are related through $N$. Each triple $(N, T, S)$ defines a first-order fuzzy logic. Thus,
one can speak of fuzzy logics (in the plural).

The t-conorms dual to $T_1$, $T_2$, $T_3$ are

$S_1(x, y) = \max\{x, y\}$,

$S_2(x, y) = x + y - xy$,

$S_3(x, y) = \min\{x + y, 1\}$.
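The duality formula can be applied mechanically to the three examples. The sketch below, with a hypothetical helper `dual` and a grid of our choosing, builds $1 - T(1 - x, 1 - y)$ from each t-norm and compares it at the grid points with the closed forms listed for the dual t-conorms.

```python
from fractions import Fraction
from itertools import product

GRID = [Fraction(k, 10) for k in range(11)]  # sample points of [0, 1]

def dual(T):
    """Return the operation (x, y) -> 1 - T(1 - x, 1 - y)."""
    return lambda x, y: 1 - T(1 - x, 1 - y)

# The three t-norm examples, paired with the stated dual t-conorms.
pairs = [
    (lambda x, y: min(x, y), lambda x, y: max(x, y)),
    (lambda x, y: x * y, lambda x, y: x + y - x * y),
    (lambda x, y: max(x + y - 1, Fraction(0)),
     lambda x, y: min(x + y, Fraction(1))),
]

duals_match = [all(dual(T)(x, y) == S(x, y)
                   for x, y in product(GRID, repeat=2))
               for T, S in pairs]

# Taking the dual twice returns the original operation.
involution = all(dual(dual(T))(x, y) == T(x, y)
                 for T, _ in pairs for x, y in product(GRID, repeat=2))
```

Since the construction is its own inverse, the same helper also turns each t-conorm back into its t-norm.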

The triple $(N, T, S)$ given by

$N(x) = 1 - x$,

$T(x, y) = \min\{x, y\}$, and

$S(x, y) = \max\{x, y\}$,

forms the collection of basic operations on fuzzy sets. It is interesting to note that some
t-norms admit probabilistic interpretations. For example, if the t-norm $T$ is such that for
$x \le u$ and $y \le v$,

$T(u, y) - T(x, y) \le T(u, v) - T(x, v)$,   (*)

then $T$ is a two-dimensional copula (see Schweizer and Sklar, 1983). That is,

$T : [0, 1]^2 \to [0, 1]$

satisfies (*) and

a) $T(0, x) = T(x, 0) = 0$, for $x \in [0, 1]$, and

b) $T(1, x) = T(x, 1) = x$, for $x \in [0, 1]$.

t-norms satisfy all axioms of two-dimensional copulas (or copulas for short) except
possibly (*). In general, copulas are not associative. The probabilistic interpretation of
copulas is this. The distribution function of a random variable $\xi$ that is uniformly
distributed on $[0, 1]$ is

$F(x) = 0$ for $x < 0$,
$F(x) = x$ for $0 \le x \le 1$,
$F(x) = 1$ for $x > 1$.

If $(\xi, \eta)$ is a random vector with joint distribution function

$G(x, y) = P(\xi \le x, \eta \le y)$,

then the marginal distributions are

$F_\xi(x) = P(\xi \le x, \eta < +\infty) = G(x, +\infty)$,

and

$F_\eta(y) = P(\xi < +\infty, \eta \le y) = G(+\infty, y)$.

If $F_\xi$ and $F_\eta$ are both equal to $F$, then the restriction of $G$ to $[0, 1]^2$ is a copula.
Conversely, if $T$ is a copula, then $H : \mathbb{R}^2 \to [0, 1]$ defined by

$H(x, y) = T(F(x), F(y))$

is a two-dimensional distribution function each of whose marginal distributions is $F$,
where $\mathbb{R}$ denotes the set of real numbers.

Thus, roughly speaking, a copula is nothing more than a two-dimensional distribution
function on $[0, 1]^2$ with uniform marginal distributions on $[0, 1]$. A basic result in
Schweizer and Sklar (1983) is this. If $H$ is the joint distribution function of $(\xi, \eta)$, then
there is a copula $T$ such that

$H(x, y) = T(H(x, +\infty), H(+\infty, y))$, $\forall x, y \in \mathbb{R}$.

The t-norms $T_1$, $T_2$, $T_3$ above are all copulas. For more details, see Schweizer and Sklar,
1983.
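Condition (*) is the only copula axiom that a t-norm may fail, and it can be probed on a grid. In the sketch below, which rests on our own assumptions (grid, names, exact rational arithmetic), $T_1$, $T_2$, $T_3$ are tested against (*), together with the so-called drastic product, a t-norm that is not discussed in the text and is included only as an example of a t-norm violating (*).

```python
from fractions import Fraction
from itertools import product

GRID = [Fraction(k, 10) for k in range(11)]  # sample points of [0, 1]

def t1(x, y):
    return min(x, y)

def t2(x, y):
    return x * y

def t3(x, y):
    return max(x + y - 1, Fraction(0))

def drastic(x, y):
    """Drastic product: min(x, y) if one argument is 1, else 0 (assumed example)."""
    return min(x, y) if max(x, y) == 1 else Fraction(0)

def satisfies_star_on_grid(T):
    """Check T(u, y) - T(x, y) <= T(u, v) - T(x, v) for x <= u and y <= v."""
    return all(T(u, y) - T(x, y) <= T(u, v) - T(x, v)
               for x, u, y, v in product(GRID, repeat=4)
               if x <= u and y <= v)

star = {T.__name__: satisfies_star_on_grid(T) for T in (t1, t2, t3, drastic)}

# One explicit violation for the drastic product: x = 1/2, u = 1, y = 9/10, v = 1.
x, u, y, v = Fraction(1, 2), Fraction(1), Fraction(9, 10), Fraction(1)
gap = (drastic(u, v) - drastic(x, v)) - (drastic(u, y) - drastic(x, y))
```

A negative `gap` means that the rectangle with corners $(x, y)$ and $(u, v)$ would receive negative mass, which is impossible for a distribution function.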

7.3  Syntax  representation  of  fuzzy  sets 

Let $X$ be a set. The power set of $X$ is denoted by $\mathcal{P}(X)$. One can identify $\mathcal{P}(X)$
with the space $\{0, 1\}^X = \{f : X \to \{0, 1\}\}$ via the bijection

$\varphi : \{0, 1\}^X \to \mathcal{P}(X)$

defined by

$\varphi(f) = f^{-1}(1) = \{x : f(x) = 1\}$.

Two remarks are in order here. First, if $a \in \mathcal{P}(X)$, then $\varphi^{-1}(a) = 1_a$, the indicator
function of $a$ on $X$. If $\mathcal{P}(X)$ represents a collection of propositions, then it is the base
space of classical two-valued logic or the "syntax part" of the logic. For each $x \in X$, the
map $h_x : \mathcal{P}(X) \to \{0, 1\}$ defined by

$h_x(a) = 1$ if $x \in a$,
$h_x(a) = 0$ if $x \in a'$,

is a Boolean homomorphism, that is, a model of $C_2$. (See Chapter 6.) Thus, the space of
indicator functions $\{0, 1\}^X$ plays the role of "semantic part" of the logic, in the sense
that, given a model $h_x$ or simply $x$, $a$ is true or false in $x$ according to whether

$1_a(x) = 1$ or $0$.


Second, the above bijection $\varphi$ can be written in a more explicit form:

$$\varphi(f) = (f^{-1}(1),\ X),$$

with $f^{-1}(1) \subseteq X$. If we define, for each $t \in [0,1]$,

$$A_t(f) = \{x : f(x) \ge t\},$$

then $A_0(f) = X$ and, for $t > 0$, $A_t(f) = f^{-1}(1)$.

Conversely, given $(a, X)$, with $a \subseteq X$, then

$$1_a(x) = \sup\{t \in [0,1] : x \in A_t\},$$

where $A_0 = X$ and $A_t = a$ for all $t > 0$.

These facts are carried over to the fuzzy case in a straightforward manner as follows.
In the standard approach, membership functions are used to model fuzzy concepts. Thus,
the "semantic part" of fuzzy logic is $\mathcal{F}(X) = [0,1]^X$. The "syntax part" is obtained as in
the case of two-valued logic. Specifically, if $f \in \mathcal{F}(X)$, then the level sets (or $\alpha$-cuts) of $f$
are, for $\alpha \in [0,1]$, $A_\alpha(f) = \{x : f(x) \ge \alpha\}$. (See for example, Dubois and Prade, 1980.)
The family of ordinary subsets $A_\alpha$, $\alpha \in [0,1]$, of $X$ satisfies the following properties:

(i) $\alpha \le \beta$ implies $A_\beta \subseteq A_\alpha$,

(ii) $A_0 = X$, and

(iii) for $I \subseteq [0,1]$, $\bigcap_{\alpha \in I} A_\alpha = A_{\sup I}$.

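The condition (iii) is a form of left-continuity of the map $A : [0,1] \to \mathcal{P}(X)$ defined
by $\alpha \to A_\alpha$ in the sense that, for $\alpha \in [0,1]$,

$$\lim_{\beta \uparrow \alpha} A_\beta = \bigcap_{\beta < \alpha} A_\beta = A_{\alpha-} = A_\alpha,$$

(where, similarly, $A_{\alpha+}$ denotes $\lim_{\beta \downarrow \alpha} A_\beta = \bigcup_{\beta > \alpha} A_\beta$).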

It turns out that these three conditions characterize the syntax part of fuzzy logic.
Indeed, let us call a family $\{A_\alpha, \alpha \in [0,1]\}$ of subsets of $X$ a flou set (for example,
Gentilhomme, 1968; Negoita and Ralescu, 1975) if the $A_\alpha$'s satisfy (i), (ii) and (iii)
above. Denote the class of all flou sets of $X$ by $\mathcal{FL}(X)$, and consider the map
$\varphi : \mathcal{F}(X) \to \mathcal{FL}(X)$ defined by $\varphi(f) = \{A_\alpha(f), \alpha \in [0,1]\}$. Note that a flou set is in fact
a map $A : [0,1] \to \mathcal{P}(X)$ given by $\alpha \to A_\alpha$, and we write $A = \{A_\alpha, \alpha \in [0,1]\}$ for
simplicity. Thus, two flou sets $A = \{A_\alpha, \alpha \in [0,1]\}$ and $B = \{B_\alpha, \alpha \in [0,1]\}$ are equal
if and only if $A_\alpha = B_\alpha$ for $\alpha \in [0,1]$. It is easy to check that $\varphi$ is a bijection. Indeed,
if $\varphi(f) = \varphi(g)$, then $A_\alpha(f) = A_\alpha(g)$, for $\alpha \in [0,1]$. But, obviously for $x \in X$,

$$f(x) = \sup\{\alpha : x \in A_\alpha(f)\},$$

and hence $f = g$, that is, $\varphi$ is one-to-one. To show that $\varphi$ is onto, we take an arbitrary
flou set $A = \{A_\alpha, \alpha \in [0,1]\}$, and consider its "characteristic function"

$$\gamma_A : X \to [0,1]$$

where

$$\gamma_A(x) = \sup\{\alpha : x \in A_\alpha\}.$$

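We are going to show that $A_\alpha$ is the $\alpha$-level set of $\gamma_A$, $\forall \alpha \in [0,1]$. If $x \in A_\alpha$, then by
construction, $\gamma_A(x) \ge \alpha$. Conversely, let $x$ be such that $\gamma_A(x) \ge \alpha$ and
$I = \{\beta : x \in A_\beta\}$; we have $\gamma_A(x) = \sup I$. By condition (iii), $\bigcap_{\beta \in I} A_\beta = A_{\sup I}$, so that
$x \in A_{\sup I}$. By (i), $A_{\sup I} \subseteq A_\alpha$. Thus $\{x : \gamma_A(x) \ge \alpha\} \subseteq A_\alpha$, and the result follows.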
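
The correspondence between a membership function and its flou set can be checked on a small example. The sketch below is only illustrative: it assumes a finite set $X$ and restricts the levels $\alpha$ to a finite grid containing every value taken by $f$, so that the supremum in $\gamma_A$ becomes a maximum. The names `level_set`, `characteristic_function` and the sample data are hypothetical.

```python
from fractions import Fraction as Fr

def level_set(f, alpha):
    """A_alpha(f) = {x : f(x) >= alpha}."""
    return frozenset(x for x, v in f.items() if v >= alpha)

def characteristic_function(flou, X):
    """gamma_A(x) = sup{alpha : x in A_alpha}; a max here, since the grid is finite."""
    return {x: max(a for a, A in flou.items() if x in A) for x in X}

# Hypothetical membership function on a four-point set.
f = {"p": Fr(0), "q": Fr(1, 3), "r": Fr(1, 3), "s": Fr(1)}
X = set(f)

# Finite grid of levels: every value taken by f, plus the endpoints 0 and 1.
grid = sorted(set(f.values()) | {Fr(0), Fr(1)})
flou = {alpha: level_set(f, alpha) for alpha in grid}

# Recover f from its level sets.
recovered = characteristic_function(flou, X)
print(recovered == f)
```

Exact rational arithmetic is used so that the comparison of levels is not affected by floating-point rounding.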

The logical operations on $\mathcal{FL}(X)$ can be defined in such a way that $\varphi$ is an
isomorphism. For this purpose, conjunction and disjunction are defined as follows. For
$A = \{A_\alpha, \alpha \in [0,1]\}$ and $B = \{B_\alpha, \alpha \in [0,1]\}$,

$$A \wedge B = \{A_\alpha \cap B_\alpha,\ \alpha \in [0,1]\}$$

and

$$A \vee B = \{A_\alpha \cup B_\alpha,\ \alpha \in [0,1]\}.$$

With respect to these operations, Negoita and Ralescu (1975) have established a lattice
$(\wedge, \vee)$-isomorphism between $\mathcal{F}(X)$ and $\mathcal{FL}(X)$. This can be seen by observing that for
$f, g : X \to [0,1]$, we have for $\alpha \in [0,1]$,

$$\{x : f(x) \wedge g(x) \ge \alpha\} = \{x : f(x) \ge \alpha\} \cap \{x : g(x) \ge \alpha\},$$
and
$$\{x : f(x) \vee g(x) \ge \alpha\} = \{x : f(x) \ge \alpha\} \cup \{x : g(x) \ge \alpha\}.$$
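The "negation" operator on $\mathcal{FL}(X)$ is defined as follows. From

$$A = \{A_\alpha,\ \alpha \in [0,1]\},$$

look at its "characteristic function"

$$\gamma_A(x) = \sup\{\alpha : x \in A_\alpha\}.$$

Consider $\gamma'_A(x) = 1 - \gamma_A(x)$, for $x \in X$. Let $A' = \{A'_\alpha, \alpha \in [0,1]\}$, where $A'_\alpha$ is the
$\alpha$-level set of $\gamma'_A(\cdot)$, that is,

$$\begin{aligned}
A'_\alpha &= \{x : \gamma'_A(x) \ge \alpha\} \\
&= \{x : \gamma_A(x) \le 1 - \alpha\} \\
&= \{x : \sup\{\beta : x \in A_\beta\} \le 1 - \alpha\} \\
&= X \setminus \{x : \gamma_A(x) > 1 - \alpha\} \\
&= X \setminus \bigcup_{\beta > 1-\alpha} \{x : \gamma_A(x) \ge \beta\} \\
&= X \setminus A_{(1-\alpha)+}.
\end{aligned}$$

The negation of $A$ is taken to be $A'$ above.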

Theorem 1. $\varphi$ is an isomorphism between $(\mathcal{F}(X), (\cdot)', \wedge, \vee)$ and $(\mathcal{FL}(X), (\cdot)', \wedge, \vee)$.

Proof. In view of the previous analysis, it remains only to show that $\varphi$ preserves the
logical operations. The preservation of $\wedge$ and $\vee$ is obvious. That of $(\cdot)'$ follows from
the fact that if $\varphi(f) = A = \{A_\alpha, \alpha \in [0,1]\}$, then $f = \gamma_A$, and from the definition of $(\cdot)'$
on $\mathcal{FL}(X)$. $\Box$

More concretely, flou sets can be identified with partitions of $X$ as follows. By a
partition of $X$ we mean a map $Q : J \to \mathcal{P}(X)$, where $J \subseteq [0,1]$, satisfying

(i) for $\alpha \in J$, $Q_\alpha \ne \emptyset$,

(ii) for $\alpha, \beta \in J$ with $\alpha \ne \beta$, $Q_\alpha \cap Q_\beta = \emptyset$, and

(iii) $\bigcup_{\alpha \in J} Q_\alpha = X$.

Note that a usual partition of $X$ is the range of a partition (map) in the above sense.
As in the case of flou sets, two partitions $Q^{(1)} : J_1 \to \mathcal{P}(X)$, $Q^{(2)} : J_2 \to \mathcal{P}(X)$
are equal if and only if $J_1 = J_2$ and $Q^{(1)}_\alpha = Q^{(2)}_\alpha$ for $\alpha \in J_1$.

Theorem 2. Let $\mathbb{P}(X)$ be the space of all partitions of $X$, and $\eta : \mathbb{P}(X) \to \mathcal{FL}(X)$ be
defined by $\eta(Q) = A$, where $Q : J \to \mathcal{P}(X)$ and $A_\alpha = \bigcup_{\beta \in J,\ \beta \ge \alpha} Q_\beta$. Then $\eta$ is a bijection.

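Proof. First, $\eta(Q)$ so defined is indeed a flou set. Since $Q$ is a partition, $A_0 = X$.
Obviously, by construction, for $\alpha, \beta \in [0,1]$, if $\alpha \le \beta$ then $A_\beta \subseteq A_\alpha$. Finally, the
"left-continuity" of $A$ is seen as follows. Let $I \subseteq [0,1]$. If

$$x \in A_{\sup I} = \bigcup_{\beta \in J,\ \beta \ge \sup I} Q_\beta,$$

then by monotonicity of $A$,

$$x \in \bigcap_{\alpha \in I} \Big( \bigcup_{\beta \in J,\ \beta \ge \alpha} Q_\beta \Big).$$

Conversely, if

$$x \in \bigcap_{\alpha \in I} \Big( \bigcup_{\beta \in J,\ \beta \ge \alpha} Q_\beta \Big),$$

then for $\alpha \in I$, there exists $\beta \in J$, $\beta \ge \alpha$ such that $x \in Q_\beta$. But $Q$ is a partition, so
there is only one value of $Q$ that contains $x$, say $Q_{\beta(x)}$. Thus $\beta(x) \ge \alpha$, for $\alpha \in I$, and
hence $\sup I \le \beta(x)$, implying that $x \in A_{\sup I}$.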

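To show that $\eta$ is onto, we proceed as follows. Let $A = \{A_\alpha, \alpha \in [0,1]\}$ be a flou
set of $X$. By Theorem 1, $A$ is uniquely determined by its characteristic function $\gamma_A$. Let
$J_A \subseteq [0,1]$ be the range of $\gamma_A$, that is, $\alpha \in J_A$ if and only if

$$\{x \in X : \gamma_A(x) = \alpha\} = (\gamma_A = \alpha) = \gamma_A^{-1}(\alpha) \ne \emptyset.$$

Obviously, $J_A \ne \emptyset$. For $\beta \in J_A$, define

$$Q_\beta = (\gamma_A = \beta) = (\gamma_A \ge \beta) \setminus (\gamma_A > \beta) = A_\beta \setminus A_{\beta+},$$

where $A_{\beta+} = \bigcup_{\alpha > \beta} A_\alpha$, with $A_{1+} = \emptyset$. Obviously, $Q_\beta \ne \emptyset$ for $\beta \in J_A$. By the definition
of $J_A$, if $\alpha, \beta \in J_A$ and $\alpha \ne \beta$, we have $Q_\alpha \cap Q_\beta = \emptyset$. Finally, if $x \in X$, then
$x \in (\gamma_A = \gamma_A(x))$ so that $x \in Q_\alpha$ with $\alpha = \gamma_A(x) \in J_A$. Thus $Q = (Q_\beta, \beta \in J_A)$ is a
partition of $X$. It remains to check that $\eta(Q) = A$. But

$$A_\alpha = \gamma_A^{-1}[\alpha, 1] = \bigcup_{\beta \ge \alpha} \gamma_A^{-1}(\beta) = \bigcup_{\beta \in J_A,\ \beta \ge \alpha} Q_\beta.$$

This last equality follows from the fact that if $\beta \ge \alpha$ and $\beta \notin J_A$, then $\gamma_A^{-1}(\beta) = \emptyset$.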

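To show that $\eta$ is one-to-one, we suppose $\eta(Q^{(1)}) = \eta(Q^{(2)}) = A$, where $Q^{(1)}$ and
$Q^{(2)}$ are two partitions of $X$, with domains $J^{(1)}$ and $J^{(2)}$, respectively. From the above
discussion, we see that $J^{(1)} = J^{(2)}$ = range of $\gamma_A$. Also, for $\beta \in J^{(1)}$,

$$Q^{(1)}_\beta = Q^{(2)}_\beta = A_\beta \setminus A_{\beta+}. \quad \Box$$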
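
The map $\eta$ of Theorem 2 can be illustrated on a finite example. The following is a minimal sketch, assuming a finite $X$ and a hypothetical membership function: the partition is indexed by the range of the function, and each level set is rebuilt as a union of blocks. The function names `partition_of` and `eta` are introduced here for illustration only.

```python
from fractions import Fraction as Fr

def partition_of(f):
    """Q_beta = {x : f(x) = beta}, indexed by J = range of f."""
    Q = {}
    for x, v in f.items():
        Q.setdefault(v, set()).add(x)
    return {beta: frozenset(block) for beta, block in Q.items()}

def eta(Q, alpha):
    """A_alpha = union of the blocks Q_beta with beta in J and beta >= alpha."""
    members = set()
    for beta, block in Q.items():
        if beta >= alpha:
            members |= block
    return frozenset(members)

# Hypothetical membership function on a four-point set.
f = {"p": Fr(0), "q": Fr(1, 3), "r": Fr(1, 3), "s": Fr(1)}
Q = partition_of(f)

# Level sets rebuilt from the partition, on a grid of levels k/6.
levels = [Fr(k, 6) for k in range(7)]
flou = {alpha: eta(Q, alpha) for alpha in levels}
for alpha in levels:
    print(alpha, sorted(flou[alpha]))
```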

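Remarks. (i) In view of Theorem 2 and the algebraic structure of $\mathcal{FL}(X)$, one can
define logical operations on $\mathbb{P}(X)$ so that the bijection $\eta$ is an isomorphism.
Specifically, if $Q : J \subseteq [0,1] \to \mathcal{P}(X)$ is a partition of $X$, then its "negation" is the
partition $Q' : 1 - J \to \mathcal{P}(X)$ defined by $Q'_\alpha = Q_{1-\alpha}$. This is justified as follows. Let
$A = \eta(Q)$. Then for $\beta \in J$, $Q_\beta = (\gamma_A = \beta)$. The "complement" of $\gamma_A$ is $\gamma'_A = 1 - \gamma_A$.
The range of $\gamma'_A$ is $1 - J$. Thus $Q'_\alpha = (\gamma'_A = \alpha)$ for $\alpha \in 1 - J$, that is,
$Q'_\alpha = (\gamma_A = 1 - \alpha) = Q_{1-\alpha}$.

For $i = 1, 2$, let $Q^{(i)} : J_i \to \mathcal{P}(X)$ and $A^{(i)} = \eta(Q^{(i)})$. The "conjunction" of $\gamma_{A^{(1)}}$
and $\gamma_{A^{(2)}}$ is $\gamma_{A^{(1)}} \wedge \gamma_{A^{(2)}}$. Define $Q : \mathrm{range}(\gamma_{A^{(1)}} \wedge \gamma_{A^{(2)}}) \to \mathcal{P}(X)$ by
$Q_\alpha = (\gamma_{A^{(1)}} \wedge \gamma_{A^{(2)}} = \alpha)$. $Q$ is taken to be the "conjunction" of $Q^{(1)}$ and $Q^{(2)}$.
Similarly, the "disjunction" of $Q^{(1)}$ and $Q^{(2)}$ is the partition defined on the range of
$\gamma_{A^{(1)}} \vee \gamma_{A^{(2)}}$ by $(\gamma_{A^{(1)}} \vee \gamma_{A^{(2)}} = \alpha)$.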

(ii) A similar isomorphism can be established between $\mathcal{FL}(X)$ and the class of
nested random sets of $X$. Specifically, by a nested random set of $X$, we mean a random
element $S$, defined on the probability space $(\Omega, \mathcal{A}, P)$ with values in $\mathcal{P}(X)$, of the form
$S(\omega) = A_{U(\omega)}$, where $A = \{A_\alpha, \alpha \in [0,1]\}$ is a flou set, and $U$ is a random variable,
defined on $(\Omega, \mathcal{A}, P)$, and without loss of generality, uniformly distributed on $[0,1]$. For
a fixed $U$, consider

$$\mathcal{S}(U) = \{A_U : A \in \mathcal{FL}(X)\}.$$

It is easy to check that the map

$$\tau : \mathcal{FL}(X) \to \mathcal{S}(U)$$

defined by $\tau(A) = A_U$ is an isomorphism. For more details on algebraic and probabilistic
bases for fuzzy sets, see Goodman (1990).

7.4  Fuzzy  conditionals 

As  in  any  logic,  the  concept  of  fuzzy  entailment  (or  implication)  in  fuzzy  logic  is 
essential for inference purposes. Consider a conditional rule of the form

"If X is a then Y is b",

where $a$, $b$ are fuzzy subsets of some set $\Omega$, say, and $X$ and $Y$ are variables taking
values in $\Omega$. In the theory of possibility (Zadeh, 1978), the possibility distribution of $X$
(resp. $Y$) is taken to be the membership function $\mu_a$ (resp. $\mu_b$) with the interpretation
that

$$\mathrm{Poss}(X = \omega) = \mu_a(\omega), \quad \omega \in \Omega.$$

Thus,  a  conditional  rule  of  the  above  form  can  be  viewed  as  a  "fuzzy  conditional." 

In the past, various approaches to defining conditional possibility distributions have
been proposed. Let $f(x, y)$ denote the joint possibility distribution of $(X, Y)$, and $f_1$
(respectively, $f_2$) denote the marginal possibility distribution of $X$ (respectively, $Y$), where

$$f_1(x) = \sup_y f(x, y).$$

In Nguyen (1978), the conditional possibility distribution of $Y$ given $X$ is defined by

$$f(y|x) = f(x, y)\,\max\{1,\ f_1(x)/f_2(y)\},$$

and in Hisdal (1978) as

$$f(y|x) = \begin{cases} f(x, y) & \text{if } f_1(x) > f(x, y), \\ [f(x, y),\ 1] & \text{if } f_1(x) = f(x, y). \end{cases}$$


Bouchon (1987) proposed two types of conditional forms. Let $f : \Omega_1 \to [0,1]$,
$g : \Omega_2 \to [0,1]$ and $T$ be a continuous t-norm.

(i) $(f(x)|g(y))_T = \sup\{t : t \in [0,1],\ T(g(y), t) \le f(x)\}$ with two special cases. First, for
$T(x, y) = \min\{x, y\}$,

$$(f(x)|g(y))_T = \begin{cases} 1 & \text{if } f(x) \ge g(y) \\ f(x) & \text{if } f(x) < g(y). \end{cases}$$

Second, for $T(x, y) = xy$,

$$(f(x)|g(y))_T = \min\{f(x)/g(y),\ 1\}.$$

(ii) Let $h : [0,1] \to \mathbb{R}^+$ be a non-increasing, continuous function with $h(0) \le +\infty$ and
$h(1) = 0$. Let $N_h(t) = h^{-1}(h(0) - h(t))$, a negation. Then

$$(f(x)|g(y))_{N_h} = \max\{N_h(g(y)),\ f(x)\}.$$

The approach (ii) is clearly a generalization of the use of material implication when
$h(x) = 1 - x$. For other works on fuzzy implication operators, see Yager (1983), Sembi and
Mamdani (1979), Mattila (1986), Smets (1990).
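
The two special cases of form (i) can be compared against a direct evaluation of the supremum. The sketch below is illustrative only: it approximates $\sup\{t : T(g, t) \le f\}$ by a maximum over a finite grid of rationals $k/n$, which is exact whenever the true supremum lies on the grid. The helper names `residuum`, `closed_form_min` and `closed_form_product` are hypothetical.

```python
from fractions import Fraction as Fr

def residuum(T, f, g, n=1000):
    """Grid version of sup{t in [0,1] : T(g, t) <= f}; t = 0 always qualifies."""
    return max(Fr(k, n) for k in range(n + 1) if T(g, Fr(k, n)) <= f)

def t_min(x, y):
    return min(x, y)

def t_product(x, y):
    return x * y

def closed_form_min(f, g):
    """Special case T = min."""
    return Fr(1) if f >= g else f

def closed_form_product(f, g):
    """Special case T = product; assumes g > 0."""
    return min(f / g, Fr(1))

# Hypothetical values f(x) and g(y).
cases = [(Fr(3, 10), Fr(3, 5)), (Fr(4, 5), Fr(1, 2)), (Fr(1, 4), Fr(1, 4))]
for f, g in cases:
    print(f, g, residuum(t_min, f, g), residuum(t_product, f, g))
```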

Goodman and Stein (1989) attempted a definition for fuzzy conditioning, based upon
the fuzzy set analogue of the basic characterization of conditional events as the solution set
of a Boolean linear equation, that is, $\{x : x \in R,\ xb = ab\}$. Specifically, if $S$ is a
generalization of Zadeh's classical $(\min, \max, 1 - (\cdot))$ system over the set of all
membership functions of fuzzy subsets of $\Omega$ (called there a semi-Boolean algebra, being a
complete, bounded distributive DeMorgan lattice) with conjunction $*$ and partial order
relation $\le$ then, for $f, g \in S$, the conditional form $(f|g)$ is given by

$$(f|g) = \{h : h \in S,\ h * g = f * g\}.$$

This led to, for $x \in \Omega$,

$$(f|g)(x) = \begin{cases} f(x) & \text{if } f(x) < g(x), \\ [g(x),\ 1] & \text{if } f(x) \ge g(x). \end{cases}$$

Unfortunately, unlike the Boolean counterpart, closure of functionally extended operations
did not hold.

In this section, we propose a new approach to fuzzy conditioning using random set
representations of fuzzy set membership functions. Let $X$ be a set, and for simplicity, let
the Boolean ring $R$ be $\mathcal{P}(X)$. For $a, b \in R$, the syntax representation of the conditional
"a given b" is the coset $(a|b) = a + Rb'$, while its semantic representation (DeFinetti,
1964; Schay, 1968) is its "generalized" indicator function

$$\varphi(a|b) : X \to \{0, 1, u\}$$

defined by

$$\varphi(a|b)(x) = \begin{cases} 1 & \text{if } x \in ab \\ 0 & \text{if } x \in a'b \\ u & \text{if } x \in b' \end{cases}$$

where $u$ stands for "undefined."

It is time to say a little more about the symbol $u$. In view of Lukasiewicz's
three-valued logic, the logical operations on the truth space $\{0, 1, u\}$ are defined by

$0' = 1$, $1' = 0$, $u' = u$;

$0 \wedge 1 = 0 \wedge 0 = 0 \wedge u = 0$, $1 \wedge 1 = 1$, $u \wedge 1 = u \wedge u = u$;

$0 \vee 0 = 0$, $0 \vee 1 = 1 \vee 1 = u \vee 1 = 1$, $0 \vee u = u \vee u = u$.

For concreteness, $u$ can be taken to be a number in $(0, 1)$, say $u = 1/2$, so that for $x, y \in
\{0, 1, 1/2\}$,

$$x' = 1 - x, \quad x \vee y = \max(x, y), \quad x \wedge y = \min(x, y).$$
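
The numerical reading of $u$ can be checked against the truth tables above. This is a small sketch, assuming the choice $u = 1/2$ mentioned in the text; the function names are illustrative.

```python
from fractions import Fraction as Fr

U = Fr(1, 2)                      # numerical stand-in for "undefined"
VALUES = (Fr(0), Fr(1), U)

def neg(x):
    return 1 - x

def conj(x, y):
    return min(x, y)

def disj(x, y):
    return max(x, y)

def show(v):
    """Print the value 1/2 as the symbol u."""
    return "u" if v == U else str(v)

# Print the full truth tables for the three-valued logic.
for x in VALUES:
    print(f"{show(x)}' = {show(neg(x))}")
for x in VALUES:
    for y in VALUES:
        print(f"{show(x)} ^ {show(y)} = {show(conj(x, y))},  "
              f"{show(x)} v {show(y)} = {show(disj(x, y))}")
```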

Consider now the case of fuzzy sets. Our approach to defining fuzzy conditionals is
based upon a relationship between membership functions of fuzzy sets and canonical
random sets which are induced by uniformly distributed random variables. See, for
example, Goodman and Nguyen (1985). Specifically, let $f : X \to [0,1]$, let $(\Omega, \mathcal{A}, P)$ be
a probability space and $U$ be a random variable defined on it and uniformly distributed
over the unit interval $[0,1]$. The random variable $U$ is thought of as a device for
randomizing the $\alpha$-level sets of $f$. Thus for $x \in X$,

$$f(x) = P(\omega : U(\omega) \le f(x)) = P(U \le f(x)) = P(U^{-1}[0, f(x)]).$$

In this way, $f$ is the one-point coverage function of the canonical random set $S_U(f)$ defined
by

$$S_U(f)(\omega) = \{x : x \in X,\ U(\omega) \le f(x)\} = f^{-1}[U(\omega), 1].$$

That is, $f(x) = P(\omega : x \in S_U(f)(\omega))$, $x \in X$.
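
The one-point coverage property can be illustrated numerically. In the sketch below, the uniform variable $U$ is replaced by an evenly spaced grid of midpoints in $[0,1]$, which is an assumption made to keep the computation deterministic; the membership values and the names `canonical_random_set` and `coverage` are hypothetical.

```python
def canonical_random_set(f, u):
    """S(omega) = {x : U(omega) <= f(x)} for the realized value u = U(omega)."""
    return {x for x, v in f.items() if u <= v}

def coverage(f, x, n=10_000):
    """Approximate P(x in S) by averaging over the midpoints u = (k + 0.5)/n."""
    hits = sum(x in canonical_random_set(f, (k + 0.5) / n) for k in range(n))
    return hits / n

# Hypothetical membership function on a four-point set.
f = {"p": 0.0, "q": 0.25, "r": 0.7, "s": 1.0}

# The estimated coverage of each point should be close to its membership value.
estimates = {x: coverage(f, x) for x in f}
for x in f:
    print(x, f[x], estimates[x])
```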

The logical operations among membership functions can be defined using this
relationship. First, the set complement of $U^{-1}[0, f(x)]$ is $(U^{-1}[0, f(x)])^c$, and

$$P\big((U^{-1}[0, f(x)])^c\big) = P(x \notin f^{-1}[U, 1]) = 1 - P(U \le f(x)) = 1 - f(x),$$

since $U$ is uniformly distributed. Thus, the negation of $f$ is $1 - f$. Next, let $f$,
$g : X \to [0,1]$, and $U$, $V$ be their corresponding uniformly distributed random variables,
both defined on $(\Omega, \mathcal{A}, P)$. The joint distribution function of $U$, $V$ is a copula $F$ (or
more precisely a 2-copula, see Schweizer and Sklar, 1983, p. 82-83).

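To define the conjunction of $f$ and $g$, we look at the set intersection of their random
set representations, namely

$$U^{-1}[0, f(x)] \cap V^{-1}[0, g(x)].$$

We have

$$\begin{aligned}
P\big(U^{-1}[0, f(x)] \cap V^{-1}[0, g(x)]\big) &= P\big(x \in f^{-1}[U, 1] \cap g^{-1}[V, 1]\big) \\
&= P(U \le f(x),\ V \le g(x)) = F(f(x), g(x)).
\end{aligned}$$

Thus the conjunction $f \wedge g$ is defined for $x \in X$ by $(f \wedge g)(x) = F(f(x), g(x))$.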

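For disjunction $\vee$ among membership functions, we look at the set union of their
random set representations, namely

$$U^{-1}[0, f(x)] \cup V^{-1}[0, g(x)].$$

We have

$$\begin{aligned}
P\big(U^{-1}[0, f(x)] \cup V^{-1}[0, g(x)]\big)
&= P\big(U^{-1}[0, f(x)]\big) + P\big(V^{-1}[0, g(x)]\big) - P\big(U^{-1}[0, f(x)] \cap V^{-1}[0, g(x)]\big) \\
&= P\big(x \in f^{-1}[U, 1]\big) + P\big(x \in g^{-1}[V, 1]\big) - P\big(x \in f^{-1}[U, 1] \cap g^{-1}[V, 1]\big) \\
&= f(x) + g(x) - F(f(x), g(x)) \\
&= F^*(f(x), g(x)),
\end{aligned}$$

where $F^*$ is the dual copula of $F$. Of course, the logical system above of membership
functions depends upon the copula $F$.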
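
The dependence of the connectives on the copula can be made concrete. The sketch below evaluates the conjunction $F(f(x), g(x))$ and the disjunction $f(x) + g(x) - F(f(x), g(x))$ for three standard copulas. Only min is singled out in the text; the product and Lukasiewicz copulas are included here as assumptions for comparison, and all function names are illustrative.

```python
from fractions import Fraction as Fr

def cop_min(s, t):
    return min(s, t)

def cop_product(s, t):
    return s * t

def cop_lukasiewicz(s, t):
    return max(Fr(0), s + t - 1)

def conj(F, fx, gx):
    """(f ^ g)(x) = F(f(x), g(x))."""
    return F(fx, gx)

def disj(F, fx, gx):
    """(f v g)(x) = f(x) + g(x) - F(f(x), g(x)), the dual copula F*."""
    return fx + gx - F(fx, gx)

# Hypothetical membership values f(x) and g(x) at a single point x.
fx, gx = Fr(3, 10), Fr(4, 5)
for name, F in [("min", cop_min), ("product", cop_product),
                ("lukasiewicz", cop_lukasiewicz)]:
    print(name, conj(F, fx, gx), disj(F, fx, gx))
```

With min as the copula, the disjunction reduces to max, which recovers Zadeh's classical connectives.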

The procedure above is carried over to the conditional case as follows. The
conditional counterpart of "f given g," denoted as $(f|g)$, is the conditional event
$(U^{-1}[0, f(x)] \mid V^{-1}[0, g(x)])$ in the conditional space $(\mathcal{A}|\mathcal{A})$. Thus, it is natural to define

$$\begin{aligned}
(f|g)(x) &= P\big(U^{-1}[0, f(x)] \mid V^{-1}[0, g(x)]\big) \\
&= P(U \le f(x) \mid V \le g(x)) \\
&= \frac{F(f(x), g(x))}{g(x)}
\end{aligned}$$

when $g(x) \ne 0$.

For $(f|g)(\cdot)$ to reduce to $\varphi(a|b)(\cdot)$ when $f = 1_a$, $g = 1_b$ with $a$, $b$ being elements
of a field of subsets of $X$, a third value $u$ (undefined) has to be assigned to $(f|g)(x)$
when $g(x) = 0$.

We consider first a special case in which the copula $F$ is taken to be min, that is
$F(x, y) = \min\{x, y\}$. We use also the symbol $\wedge$ for minimum. Also, in the sequel
$u = [0,1]$.


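Definition. Let $f, g : X \to [0,1]$. The semantic part of the fuzzy conditional

$$(f|g) : X \to [0,1] \cup \{u\}$$

is defined by

$$(f|g)(x) = \begin{cases} \dfrac{f(x) \wedge g(x)}{g(x)} & \text{when } g(x) \ne 0 \\[2mm] u & \text{when } g(x) = 0. \end{cases}$$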

As in the case of ordinary conditionals, the abstract symbol $u$ has to be clarified. This is
because the range of $(f|g)(\cdot)$ involves $u$, implying that the nature of fuzziness of $(f|g)$
depends on $u$. Of course, one can simply imagine that $u$ is an abstract symbol, and
define the logical operations $(\cdot)'$, $\wedge$, $\vee$ on $[0,1] \cup \{u\}$. A concrete candidate for $u$ is
the whole unit interval $[0,1]$. This choice turns out to be convenient and also consistent
with interval analysis. Taking $u$ as $[0,1]$, fuzzy conditionals are interval-valued fuzzy
sets.

Before proceeding further, let us specify the (Lukasiewicz) logical operations on the
space $[0,1] \cup \{[0,1]\}$, where real numbers $x$ in $[0,1]$ are considered as intervals $[x, x]$.
Two intervals $[a_1, a_2]$, $[b_1, b_2]$ in $[0,1]$ are equal if and only if $a_1 = b_1$ and $a_2 = b_2$.
The logical operations on $[0,1]$ are

$$x' = 1 - x, \quad x \wedge y = \min(x, y), \quad x \vee y = \max(x, y).$$

As in Interval Analysis (for example, Moore, 1966, 1979; Alefeld and Herzberger, 1983),
logical operations on the set $I([0,1])$ of intervals of $[0,1]$ are set-extension operations, so
that, using the same notation,

$$[a, b]' = \{x' : a \le x \le b\} = [1 - b,\ 1 - a] = 1 - [a, b].$$

Note that $[a, b]'' = [a, b]$. In particular

$$u' = [0,1]' = [0,1] = u.$$

$$[a, b] \wedge [c, d] = \{x \wedge y : a \le x \le b,\ c \le y \le d\} = [a \wedge c,\ b \wedge d],$$

$$[a, b] \vee [c, d] = \{x \vee y : a \le x \le b,\ c \le y \le d\} = [a \vee c,\ b \vee d].$$

Note that $(\cdot)'$ is not a true complement (so that the law of excluded middle does not
hold) since, in general,

$$[a, b]' \wedge [a, b] \ne 0,$$

$$[a, b]' \vee [a, b] \ne 1.$$

However, it is easy to check that DeMorgan's laws do hold, that is,

$$([a, b] \wedge [c, d])' = [a, b]' \vee [c, d]',$$

$$([a, b] \vee [c, d])' = [a, b]' \wedge [c, d]'.$$

Moreover, both $\wedge$ and $\vee$ are commutative and associative. Also, the following
distributive laws hold:

$$[a, b] \wedge ([c, d] \vee [e, f]) = ([a, b] \wedge [c, d]) \vee ([a, b] \wedge [e, f]),$$

$$[a, b] \vee ([c, d] \wedge [e, f]) = ([a, b] \vee [c, d]) \wedge ([a, b] \vee [e, f]).$$

This last fact follows by the distributivity of $\wedge$ over $\vee$ (and of $\vee$ over $\wedge$) on real numbers.
Finally, the order relation on $I([0,1])$ is defined by setting $[a, b] \le [c, d]$ if and only if
$[a, b] = [a, b] \wedge [c, d]$, which is the same as $a \le c$ and $b \le d$. The smallest and greatest
elements of $I([0,1])$ are $[0, 0] = 0$, $[1, 1] = 1$, respectively.
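
The interval operations and the laws just stated can be checked exhaustively on a small grid. The following is a minimal sketch, assuming intervals are represented as pairs `(a, b)` with endpoints restricted to multiples of 1/4; the representation and function names are illustrative choices, not part of the text.

```python
from fractions import Fraction as Fr

def neg(i):
    """[a, b]' = [1 - b, 1 - a]."""
    a, b = i
    return (1 - b, 1 - a)

def conj(i, j):
    """[a, b] ^ [c, d] = [a ^ c, b ^ d]."""
    return (min(i[0], j[0]), min(i[1], j[1]))

def disj(i, j):
    """[a, b] v [c, d] = [a v c, b v d]."""
    return (max(i[0], j[0]), max(i[1], j[1]))

def leq(i, j):
    """[a, b] <= [c, d] iff [a, b] = [a, b] ^ [c, d]."""
    return conj(i, j) == i

# All intervals whose endpoints are multiples of 1/4.
grid = [Fr(k, 4) for k in range(5)]
intervals = [(a, b) for a in grid for b in grid if a <= b]
u = (Fr(0), Fr(1))

de_morgan = all(neg(conj(i, j)) == disj(neg(i), neg(j)) and
                neg(disj(i, j)) == conj(neg(i), neg(j))
                for i in intervals for j in intervals)
distributive = all(conj(i, disj(j, k)) == disj(conj(i, j), conj(i, k)) and
                   disj(i, conj(j, k)) == conj(disj(i, j), disj(i, k))
                   for i in intervals for j in intervals for k in intervals)
print(de_morgan, distributive, conj(neg(u), u))
```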
The logical operations on $[0,1] \cup \{[0,1]\}$ are the restrictions of the above operations
on $I([0,1])$ to its subset $[0,1] \cup \{[0,1]\}$. Thus, for example, for $x \in [0,1]$, we have

$$x \wedge [0,1] = [x, x] \wedge [0,1] = [0, x],$$

$$1 \vee [0,1] = 1, \quad [0,1] \wedge [0,1] = [0,1],$$

$$x \vee (y \wedge [0,1]) = [x,\ x \vee y], \ \ldots$$

In particular, these restrictions to $\{0, 1, u\}$, with $u = [0,1]$, form a Lukasiewicz
three-valued logic.

In the sequel, $u = [0,1]$. For $f, g : X \to [0,1]$, define

$$\Big(\frac{f}{g}\Big)_0(x) = \begin{cases} \dfrac{f(x)}{g(x)} & \text{when } g(x) \ne 0 \\[2mm] 0 & \text{when } g(x) = 0 \end{cases}$$

and let $1_{(g \ne 0)}$ denote the indicator function of the set $\{x : g(x) \ne 0\}$. We can write

$$(f|g)(x) = \Big(\frac{f \wedge g}{g}\Big)_0(x)\, 1_{(g \ne 0)}(x) \vee \big(1_{(g = 0)}(x) \wedge u\big).$$

Note that $\big(\frac{f \wedge g}{g}\big)_0$ can be replaced by $(f \wedge g)/(g \vee 1_{(g=0)})$ in the above equality. Also,
multiplication $\cdot$ on $I([0,1])$ is the set-extension operation of $\cdot$ on real numbers, so
that, for $x \in [0,1]$,

$$x \cdot u = x \cdot [0,1] = \{xy : y \in [0,1]\} = [0, x] = x \wedge [0,1] = x \wedge u.$$

Thus, it is convenient to use the form

$$(f|g) = F \vee Gu = [F,\ F \vee G]$$

for membership functions of fuzzy conditionals, where

$$F = \Big(\frac{f \wedge g}{g}\Big)_0 1_{(g \ne 0)}, \quad G = 1_{(g = 0)}.$$

Note that $G$ takes only values in $\{0, 1\}$, and if $G(x) = 1$, then $F(x) = 0$.
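
The interval form $[F, F \vee G]$ lends itself to a direct computation. The sketch below assumes the min copula and a finite set $X$, with membership functions stored as dictionaries; the function name `conditional` and the sample data are hypothetical. The first example uses indicator functions, for which the result should coincide with the generalized indicator function $\varphi(a|b)$.

```python
from fractions import Fraction as Fr

def conditional(f, g):
    """(f|g)(x) as the interval [F(x), F(x) v G(x)], with u = [0, 1] where g(x) = 0."""
    out = {}
    for x in f:
        if g[x] == 0:
            out[x] = (Fr(0), Fr(1))              # F = 0, G = 1: the value u
        else:
            F = min(f[x], g[x]) / g[x]           # G = 0: a degenerate interval
            out[x] = (F, F)
    return out

# Indicator functions of a = {1} and b = {1, 2} on X = {1, 2, 3}.
one_a = {1: Fr(1), 2: Fr(0), 3: Fr(0)}
one_b = {1: Fr(1), 2: Fr(1), 3: Fr(0)}
print(conditional(one_a, one_b))

# A genuinely fuzzy example (hypothetical membership values).
f = {1: Fr(3, 10), 2: Fr(9, 10), 3: Fr(1, 2)}
g = {1: Fr(3, 5), 2: Fr(1, 2), 3: Fr(0)}
print(conditional(f, g))
```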


Theorem 1. Let $f_1, g_1, f_2, g_2 \in \mathcal{F}(X)$. Then $(f_1|g_1) = (f_2|g_2)$ if and only if there is a
positive function $K$ on $X$ such that $g_1 = Kg_2$ and $f_1 \wedge g_1 = K(f_2 \wedge g_2)$.

Proof. For sufficiency, let $K$ be a positive function on $X$ such that $g_1 = Kg_2$ and
$f_1 \wedge g_1 = K(f_2 \wedge g_2)$. From $g_1 = Kg_2$ and $K > 0$, we see that $g_1 = 0$ if and only if
$g_2 = 0$, so that

$$(f_1|g_1) = (f_2|g_2) = u \quad \text{on } (g_1 = 0) = (g_2 = 0).$$

Next, on $(g_1 \ne 0) = (g_2 \ne 0)$,

$$(f_1|g_1)(x) = \frac{f_1(x) \wedge g_1(x)}{g_1(x)} = \frac{K(x)\,(f_2(x) \wedge g_2(x))}{K(x)\,g_2(x)} = \frac{f_2(x) \wedge g_2(x)}{g_2(x)} = (f_2|g_2)(x).$$

For necessity, suppose that $(f_1|g_1) = (f_2|g_2)$. Define

$$K(x) = \begin{cases} \dfrac{g_1(x)}{g_2(x)} & \text{on } (g_2 \ne 0) = (g_1 \ne 0) \\[2mm] c > 0 & \text{on } (g_2 = 0) = (g_1 = 0). \end{cases}$$

We then have $g_1 = Kg_2$ on $X$ and

$$\frac{f_1 \wedge g_1}{g_1} = \frac{f_2 \wedge g_2}{g_2} \quad \text{on } (g_1 \ne 0) = (g_2 \ne 0)$$

implies that

$$f_1 \wedge g_1 = \frac{g_1}{g_2}(f_2 \wedge g_2) = K(f_2 \wedge g_2).$$

On $(g_1 = 0) = (g_2 = 0)$, we always have

$$f_1 \wedge g_1 = K(f_2 \wedge g_2).$$


As in the case of fuzzy sets, let us specify the syntax representation of fuzzy conditionals.
First, let's look again at conditional
events. For $a, b \in \mathcal{P}(X)$, we have seen in Chapter 3 that $(a|b)$ is equal to the interval
$[ab,\ b \to a]$ in $\mathcal{P}(X)$, where $b \to a = b' \vee a$ (material implication). Thus, $(a|b)$ is
equivalent to $\{ab,\ b \to a\}$ or to $\{ab,\ b \to a,\ X\}$. $A = \{ab,\ b \to a,\ X\}$ can be viewed as a
finite flou set with characteristic function of the form

$$\gamma_{A^{(t)}}(x) = \begin{cases} 1 & \text{if } x \in ab \\ 0 & \text{if } x \in a'b \\ t & \text{if } x \in b' \end{cases}$$

for some $t \in [0,1]$, where in flou set form,

$$A^{(t)} = \{A^{(t)}_\alpha,\ \alpha \in [0,1]\}$$

with $A_0 = X$, $A_\alpha = a \vee b'$ for $0 < \alpha \le t$, and $A_\alpha = ab$ for $t < \alpha \le 1$.
For each $t \in [0,1]$, define $\varphi_t(a|b) : X \to \{0, 1, t\}$ by

$$\varphi_t(a|b)(x) = \begin{cases} 1 & \text{if } x \in ab \\ 0 & \text{if } x \in a'b \\ t & \text{if } x \in b' \end{cases}$$

then, since $u = [0,1]$, we have, for $x \in X$,

$$\varphi(a|b)(x) = \{\varphi_t(a|b)(x) : t \in [0,1]\}.$$

That is, the generalized indicator function $\varphi(a|b)$ is precisely the collection of real-valued
functions $\varphi_t(a|b)$, $t \in [0,1]$.

The situation is similar in the general case. Let $f, g \in \mathcal{F}(X)$. Define $(f|g)_t : X \to [0,1]$
for each $t \in [0,1]$ by

$$(f|g)_t(x) = \begin{cases} \dfrac{f \wedge g}{g}(x) & \text{when } g(x) \ne 0 \\[2mm] t & \text{when } g(x) = 0. \end{cases}$$

Then $(f|g) = \{(f|g)_t : t \in [0,1]\}$. Let $A^{(t)}$ be the flou set associated with $(f|g)_t$.
Then the syntax representation of $(f|g)$ is the family of flou sets $\{A^{(t)};\ t \in [0,1]\}$.

We turn now to logical operators among fuzzy conditionals. Since fuzzy conditionals
are interval-valued fuzzy sets, operations among them are defined pointwise, that is, by
logical operations on $I([0,1])$. First,

$$(f|g)'(x) = 1 - (f|g)(x) = \begin{cases} 1 - \dfrac{f \wedge g}{g}(x) & \text{on } (g \ne 0) \\[2mm] 1 - [0,1] = [0,1] & \text{on } (g = 0). \end{cases}$$

Now,

$$1 - \frac{f(x) \wedge g(x)}{g(x)} = \frac{g(x) - f(x) \wedge g(x)}{g(x)} = \frac{((g(x) - f(x)) \vee 0) \wedge g(x)}{g(x)}.$$

Thus $(f|g)' = ((g - f) \vee 0 \mid g)$. The situation for $\wedge$ and $\vee$ is not that simple, in the sense
that compound fuzzy conditionals are arbitrary interval-valued fuzzy sets. Using interval
representations,

$$\begin{aligned}
(f_1|g_1) \wedge (f_2|g_2) &= [F_1,\ F_1 \vee G_1] \wedge [F_2,\ F_2 \vee G_2] \\
&= [F_1 \wedge F_2,\ (F_1 \vee G_1) \wedge (F_2 \vee G_2)] \\
&= (F_1 \wedge F_2) \vee \big(((F_1 \vee G_1) \wedge (F_2 \vee G_2))u\big),
\end{aligned}$$
and
$$\begin{aligned}
(f_1|g_1) \vee (f_2|g_2) &= [F_1 \vee F_2,\ (F_1 \vee G_1) \vee (F_2 \vee G_2)] \\
&= (F_1 \vee F_2) \vee \big(((F_1 \vee G_1) \vee (F_2 \vee G_2))u\big).
\end{aligned}$$

Thus, compound fuzzy conditionals are of the form $f \vee gu = [f,\ f \vee g]$ with
$f, g : X \to [0,1]$. Simple fuzzy conditionals are of very special form, namely $g$ takes only
values in $\{0, 1\}$, and when $g = 1$, we have $f = 0$. However,

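Theorem 2. If $f, g : X \to [0,1]$, then $f \vee gu = \alpha \cdot (e|h) \vee \beta$ where $\alpha, \beta, e, h : X \to [0,1]$.
Proof. Let

$$\alpha(x) = g(x) \vee 1_{(g=0)}(x).$$

Then

$$\begin{aligned}
f \vee gu &= \alpha\Big(\frac{g}{\alpha}\,u\Big) \vee f \\
&= \alpha\big(1_{(g \ne 0)}u\big) \vee f \\
&= \alpha\big(0 \cdot 1_{(g=0)} \vee 1_{(g \ne 0)}u\big) \vee f \\
&= \alpha \cdot (0 \mid 1_{(g=0)}) \vee f. \quad \Box
\end{aligned}$$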

Remark.

An alternative approach to defining logical operations among fuzzy conditionals is this.
Instead of using arithmetic of intervals, we will explore the connection between fuzzy sets
and random sets. Let $f_i, g_i : X \to [0,1]$, $i = 1, 2$ with corresponding uniform random
variables $U_i$ and $V_i$, respectively, all defined on a probability space $(\Omega, \mathcal{A}, P)$. Let $F$
be the joint distribution function of $(U_1, V_1, U_2, V_2)$, that is $F$ is a 4-copula. Let $*$ be
a binary operator on $(\mathcal{A}|\mathcal{A})$, for example, conjunction or disjunction. The corresponding
operator among fuzzy conditionals is determined by

$$((f_1|g_1) * (f_2|g_2))(x) = P((a|b) * (c|d))$$

where

$$a = U_1^{-1}[0, f_1(x)], \quad b = V_1^{-1}[0, g_1(x)],$$
$$c = U_2^{-1}[0, f_2(x)], \quad d = V_2^{-1}[0, g_2(x)].$$

Now $(a|b) * (c|d) = (\alpha|\beta)$, say, so that

$$((f_1|g_1) * (f_2|g_2))(x) = \frac{P(\alpha\beta)}{P(\beta)},$$

in which $P(\alpha\beta)$ and $P(\beta)$ can be computed in terms of $F$, the $f_i$'s, $g_i$'s, and $x$. If we let
$P(\alpha\beta) = h(x)$, $P(\beta) = l(x)$, then for $x \in X$,

$$((f_1|g_1) * (f_2|g_2))(x) = (h|l)(x).$$

To illustrate this approach, consider negation and conjunction in the case where $F$ is
min. The situation for negation is simple, involving only a unary operator. Let
$f, g : X \to [0,1]$ with corresponding $U$, $V$. Let $a = U^{-1}[0, f(x)]$, $b = V^{-1}[0, g(x)]$ for an
arbitrary $x \in X$. Then $(a|b)' = (a'|b)$ and $P((a|b)') = P(a'|b) = 1 - P(a|b)$, so that

$$(f|g)' = 1 - (f|g).$$

For conjunction with $F = \min$, using the same notation in the procedure described
above, we have

$$(a|b)(c|d) = (abcd \mid a'b \vee c'd \vee bd),$$
and
$$\begin{aligned}
P(abcd) &= \min\{f_1(x), g_1(x), f_2(x), g_2(x)\}, \\
P(a'b \vee c'd \vee bd) &= P(a'b \vee c'd \vee abcd) = P(a'b \vee c'd) + P(abcd), \\
P(a'b \vee c'd) &= P(a'b) + P(c'd) - P(a'bc'd), \\
P(a'b) &= P(b) - P(ab) = g_1(x) - \min\{f_1(x), g_1(x)\}, \\
P(c'd) &= g_2(x) - \min\{f_2(x), g_2(x)\}, \\
P(a'bc'd) &= P(bd(a \vee c)') \\
&= P(bd) - P(abd \vee cbd) \\
&= P(bd) - P(abd) - P(cbd) + P(abcd) \\
&= \min\{g_1(x), g_2(x)\} - \min\{f_1(x), g_1(x), g_2(x)\} - \min\{g_1(x), f_2(x), g_2(x)\} + P(abcd).
\end{aligned}$$
Thus,

$$\begin{aligned}
P(a'b \vee c'd \vee bd) = {} & g_1(x) - \min\{f_1(x), g_1(x)\} + g_2(x) - \min\{f_2(x), g_2(x)\} \\
& - \min\{g_1(x), g_2(x)\} + \min\{f_1(x), g_1(x), g_2(x)\} + \min\{g_1(x), f_2(x), g_2(x)\}.
\end{aligned}$$

Therefore,

$$((f_1|g_1) \wedge (f_2|g_2))(x) = (h|l)(x),$$

where

$$\begin{aligned}
h(x) = {} & \min\{f_1(x), g_1(x), f_2(x), g_2(x)\}, \\
l(x) = {} & g_1(x) - \min\{f_1(x), g_1(x)\} + g_2(x) - \min\{f_2(x), g_2(x)\} \\
& - \min\{g_1(x), g_2(x)\} + \min\{f_1(x), g_1(x), g_2(x)\} + \min\{g_1(x), f_2(x), g_2(x)\}.
\end{aligned}$$
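
The closed forms for $h$ and $l$ can be cross-checked by measuring the events directly. The sketch below rests on the assumption that the min 4-copula corresponds to $U_1 = V_1 = U_2 = V_2 = U$ for a single uniform variable $U$, and it replaces $U$ by the midpoints of a grid of 20 cells, which is exact when all thresholds are multiples of 1/20. Function names and sample values are hypothetical.

```python
from fractions import Fraction as Fr

def h_and_l(f1, g1, f2, g2):
    """Closed forms for P(abcd) and P(a'b v c'd v bd) under the min copula."""
    h = min(f1, g1, f2, g2)
    l = (g1 - min(f1, g1) + g2 - min(f2, g2)
         - min(g1, g2) + min(f1, g1, g2) + min(g1, f2, g2))
    return h, l

def brute_force(f1, g1, f2, g2, n=20):
    """Measure the same two events cell by cell, with a = {U <= f1}, etc."""
    num = den = 0
    for k in range(n):
        u = Fr(2 * k + 1, 2 * n)          # midpoint of the k-th cell of [0, 1]
        a, b, c, d = u <= f1, u <= g1, u <= f2, u <= g2
        num += int(a and b and c and d)
        den += int((not a and b) or (not c and d) or (b and d))
    return Fr(num, n), Fr(den, n)

# Hypothetical values of f1(x), g1(x), f2(x), g2(x) at a single point x.
samples = [
    (Fr(3, 10), Fr(7, 10), Fr(1, 2), Fr(9, 10)),
    (Fr(1), Fr(1, 2), Fr(1, 4), Fr(3, 4)),
    (Fr(0), Fr(0), Fr(1, 2), Fr(1, 2)),
]
for args in samples:
    print(args, h_and_l(*args), brute_force(*args))
```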

7.5  Probability  qualification 

If  we  view  fuzzy  conditionals  as  uncertain  rules  in  expert  systems,  then,  according  to 
fuzzy  logic  (Zadeh,  1988),  there  are  three  possible  modes  of  qualification  of  these  rules, 
truth-qualification,  probability-qualification,  and  possibility-qualification.  In  this  section, 
we  address  only  the  numerical  aspect  of  probability  qualification  of  fuzzy  conditionals;  we 
lay  down  the  mathematical  framework  for  semantic  evaluations  of  fuzzy  conditionals  in 
Probability  Logic.  Other  modes  of  qualification  as  well  as  fuzzy  probabilities  are  not 
treated  here. 

Let $(X, R)$ be a measurable space. At the semantic level, following Zadeh (1968), a
fuzzy event is defined to be a measurable map from $X$ to $[0,1]$ (where $[0,1]$ is
regarded as a measurable space with its induced Borel $\sigma$-field). A probability measure $P$
on $(X, R)$ is viewed as a model, and $\|\cdot\|_P$ denotes the semantic evaluation map in the
model $P$. Thus, if $f$ is a fuzzy event, then, as proposed by Zadeh (1968), $\|f\|_P$ is
defined as follows. Let $\xi$ be a random variable with values in $X$, having $P$ as its
probability law.

$$\|f\|_P = E_P f(\xi) = \int_X f(x)\, dP(x).$$

Next, we look at the case of ordinary conditional events. For $a, b \in R$, the syntax part
$(a|b)$ was derived (Chapter 2) in a compatible manner with conditional probability. That
is, if $P$ is a probability on $R$, then $P((a|b)) = P(ab)/P(b)$, when $P(b) > 0$, is
well-defined. Thus, the probability evaluation of its "generalized" indicator function (or its
semantic part) $\varphi(a|b)$ is taken to be $P(a|b)$, that is,

$$\|\varphi(a|b)\|_P = P(a|b).$$

This evaluation of $\varphi(a|b)$ with respect to a model $P$ is sometimes referred to as a third
value for $\varphi(a|b)$. See Chapter 5, also Coletti et al., 1990. Now, with the notation of
Section 7.4,

$$\varphi(a|b) = \{\varphi_t(a|b) : t \in [0,1]\},$$

and $E_P \varphi_t(a|b)(\xi) = P(ab) + tP(b')$. It is easy to check that $P(a|b)$ is the fixed point of
the map

$$t \in [0,1] \to E_P \varphi_t(a|b)(\xi).$$

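This observation suggests an extension of Zadeh's concept of probabilities of fuzzy events
(Zadeh, 1968) to the case of probabilities of fuzzy conditional events. Specifically, let $f$, $g$
be two fuzzy events. From Section 7.4, we have

$$(f|g) = \{(f|g)_t : t \in [0,1]\}.$$

Define $\|(f|g)\|_P$ to be the fixed point of the map $t \to E_P (f|g)_t(\xi)$. Then

$$\begin{aligned}
E_P (f|g)_t(\xi) &= E_P\big((f|g)_t(\xi) \mid g(\xi) > 0\big) P(g(\xi) > 0) + E_P\big((f|g)_t(\xi) \mid g(\xi) = 0\big) P(g(\xi) = 0) \\
&= E_P\Big(\frac{f \wedge g}{g}(\xi) \,\Big|\, g(\xi) > 0\Big) P(g(\xi) > 0) + tP(g(\xi) = 0).
\end{aligned}$$

Thus the fixed point is

$$\|(f|g)\|_P = E_P\Big(\frac{f \wedge g}{g}(\xi) \,\Big|\, g(\xi) > 0\Big).$$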
Obviously, this evaluation generalizes those in the two previous special cases. Since

$$(f|g) = \Big(\frac{f \wedge g}{g}\Big)_0 1_{(g > 0)} \vee 1_{(g = 0)}u,$$

$(f|g)$ takes values in $[0,1]$ on $(g > 0)$ (on $(g = 0)$, $(f|g) = u$), namely $(f \wedge g)/g$. This
observation is used to define evaluation of compound fuzzy conditionals as follows.

A compound fuzzy conditional is of the form $f \vee gu$ where $f, g : X \to [0,1]$. Since
$f \vee gu = [f,\ f \vee g]$, we see that $f \vee gu$ takes values in $[0,1]$ only on $(g \le f)$, that is,
$x \in (g \le f)$ if and only if $(f \vee gu)(x) = f(x) \in [0,1]$. Thus, by analogy with the simple fuzzy
conditionals case, we define

$$\|f \vee gu\|_P = E_P(f \mid g \le f).$$

This evaluation is well-defined, since if $f \vee gu = h \vee ku$ then $[f,\ f \vee g] = [h,\ h \vee k]$. Thus
$f = h$, $f \vee g = h \vee k$, and $(g \le f) = (k \le h)$. Hence $E_P(f \mid g \le f) = E_P(h \mid k \le h)$.

7.6  Iterated  fuzzy  conditionals 

The topic of iterated conditioning will be treated in Section 8.1 of Chapter 8, from a
syntactic viewpoint. Here, to be complete, we discuss this concept in the setting of fuzzy
sets, but from a semantic viewpoint, that is, using generalized indicator functions of
conditional objects rather than the objects themselves. Let $R$ be a field of subsets of a set
$X$. For $a, b \in R$, the generalized indicator function of $(a|b)$ is defined as

$\varphi(a|b) : X \to I([0,1]),$

where $I([0,1])$ denotes the set of all closed sub-intervals of $[0,1]$, equipped with
arithmetic of intervals, and

$\varphi(a|b)(x) = \begin{cases} 1 & \text{for } x \in ab \\ 0 & \text{for } x \in a'b \\ u = [0,1] & \text{for } x \in b'. \end{cases}$

$\varphi(a|b)$ is a special fuzzy conditional, since $\varphi(a|b) = (1_a|1_b) = 1_a \wedge 1_b = 1_{ab}$ on $b$
and is $u$ on $b'$. Also, if

$\varphi_t(a|b)(x) = \begin{cases} 1 & \text{for } x \in ab \\ 0 & \text{for } x \in a'b \\ t & \text{for } x \in b' \end{cases}$

then

$\varphi(a|b) = \{\varphi_t(a|b) : t \in [0,1]\}.$

Each $\varphi_t(a|b)$ can be viewed as an element in $\varphi(a|b)$. Observe that
$\varphi_t(a|b) \cdot 1_b = 1_a \cdot 1_b$ for $t \in [0,1]$. Thus, as a natural approximation, we can view
$\varphi(a|b)$ as

$\{h : X \to [0,1] : h \cdot 1_b = 1_a \cdot 1_b\}.$

In a similar way, for $f, g : X \to [0,1]$ one can approximate a fuzzy conditional $(f|g)$ as

$\{h : X \to [0,1] : h \cdot g = f \wedge g\}.$


The above heuristic considerations lead to an approximate form of iterated fuzzy
conditionals. For $f_i, g_i : X \to [0,1]$, $i = 1, 2$, define

$((f_1|g_1)|(f_2|g_2)) = \bigcup_{f,g} \{(f|g) : (f|g)(f_2|g_2) = (f_1|g_1)(f_2|g_2)\},$

where operations among fuzzy conditionals are those in $I([0,1])$. Note that by a union of
the form $\bigcup_{f,g} \{(f|g)\}$, we mean the union of the sets $(f|g)(x)$, which are either $[t]$, for some
$t \in [0,1]$, or $[0,1]$, for each $x \in X$. In other words, $\bigcup$ is a map from $X$ to $\mathcal{P}([0,1])$. The
main result of this section is the proof of the fact that $\bigcup$ is an operator on the space of
fuzzy conditionals.

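For this purpose, we proceed as follows. Consider the equation

$(f|g)(f_2|g_2) = (f_1|g_1)(f_2|g_2).$  (1)

Let $h = \frac{f \wedge g}{g}$ on $(g > 0)$, we write

$(f|g) = h\,1_{(g>0)} \vee u\,1_{(g=0)}.$

Similarly, let $h_i = \frac{f_i \wedge g_i}{g_i}$ on $(g_i > 0)$, $i = 1, 2$. The equation (1) is rewritten as

$\{h\,1_{(g>0)} \vee u\,1_{(g=0)}\}\{h_2\,1_{(g_2>0)} \vee u\,1_{(g_2=0)}\} = \{h_1\,1_{(g_1>0)} \vee u\,1_{(g_1=0)}\}\{h_2\,1_{(g_2>0)} \vee u\,1_{(g_2=0)}\}.$  (2)

After multiplying out terms, we get

$\alpha \vee \beta u = \gamma \vee \delta u,$  (3)

where

$\alpha = h\,1_{(g>0)}\,h_2\,1_{(g_2>0)},$

$\beta = h\,1_{(g>0)}1_{(g_2=0)} \vee h_2\,1_{(g_2>0)}1_{(g=0)} \vee 1_{(g=0)}1_{(g_2=0)},$

$\gamma = h_1\,1_{(g_1>0)}\,h_2\,1_{(g_2>0)},$

$\delta = h_1\,1_{(g_1>0)}1_{(g_2=0)} \vee h_2\,1_{(g_2>0)}1_{(g_1=0)} \vee 1_{(g_1=0)}1_{(g_2=0)}.$

Since (3) is precisely

$[\alpha, \alpha \vee \beta] = [\gamma, \gamma \vee \delta],$

we have

$\alpha = \gamma$ and $\alpha \vee \beta = \gamma \vee \delta.$  (4)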

To solve (4), we consider the partition of $X$ consisting of $(g_2 = 0)$, $(g_2 > 0, h_2 > 0)$, and
$(g_2 > 0, h_2 = 0)$.

On $(g_2 = 0)$, (4) becomes

$h\,1_{(g>0)} \vee 1_{(g=0)} = h_1\,1_{(g_1>0)} \vee 1_{(g_1=0)}.$  (5)

Thus, 

V\s)  =  AI(J>0)  v  “1(g=0) 

=  (/!/1(gJ>0)  v  v  “1(«-*i)si)=0)  <6) 

since on $(g > 0)$, (5) yields

$h = h_1\,1_{(g_1>0)} \vee 1_{(g_1=0)},$

and on $(g = 0)$,

$1 = h_1\,1_{(g_1>0)} \vee 1_{(g_1=0)}.$

This is equivalent to $h_1 = 1$ or $g_1 = 0$, that is, to

$(1 - h_1)g_1 = 0.$

On $(g_2 > 0, h_2 > 0)$, we have from (4) that

$h\,1_{(g>0)} = h_1\,1_{(g_1>0)},$

and

$h\,1_{(g>0)} \vee 1_{(g=0)} = h_1\,1_{(g_1>0)} \vee 1_{(g_1=0)}.$  (7)

From (7) we see that $(g = 0) = (g_1 = 0)$. Indeed, if $g_1(x) = 0$, then $g(x) = 0$.
Conversely, if $g(x) = 0$, then either $g_1(x) = 0$ or $h_1(x) = 0$. But the case where
$g_1(x) > 0$ and $h_1(x) = 0$ is impossible in view of (7).

Next, on $(g > 0)$, we have $h = h_1$. Thus

$(f|g) = h\,1_{(g>0)} \vee u\,1_{(g=0)} = h_1\,1_{(g_1>0)} \vee u\,1_{(g_1=0)} = (f_1|g_1).$

On $(g_2 > 0) \cap (h_2 = 0)$, (4) supplies no constraint on $f$ and $g$, so that any $(f|g)$ is a
solution. But for $x \in X$,

$\bigcup_{f,g} \{(f|g)(x)\} = [0,1] = u.$
f,S 

Thus,  we  have 

Theorem 1. For $f_i, g_i : X \to [0,1]$, $i = 1, 2$,

KWK&M  =  ((^1(^,1 10*09  • 

where 

C  =  *7*2*2  V  = 
so that

$\varphi((a|b)|(c|d)) = (1_{abcd} \mid 1_{b(a'd' \vee cd)}) = \varphi(abcd \mid b(a'd' \vee cd)) = \varphi(ab \mid b(a'd' \vee cd)).$


This should be compared with Theorem 3 of Section 8.1 of Chapter 8.

CHAPTER 8

ITERATED CONDITIONING AND MISCELLANEOUS ISSUES

This last chapter is concerned with some topics related to measure-free conditioning.
In Section 8.1, an investigation of iterated conditioning is carried out. In Section 8.2,
some aspects of non-monotonic logic on conditionals are discussed. In Section 8.3, we
generalize some of the results concerning operations on cosets of Boolean rings to
commutative von Neumann regular rings. Finally, in Section 8.4, we close by suggesting
open problems for future research.

8.1  Iterated  conditioning 

In  Section  7.6,  we  have  touched  upon  the  concept  of  conditionals  of  conditionals  in 
the  fuzzy  case.  In  this  section,  we  return  to  the  Boolean  case  and  formulate  the  basic 
concepts  of  higher-order  conditioning.  This  investigation  of  iterated  conditioning  is  a  first 
attempt.  We  hope  that  this  will  trigger  further  work  in  this  area. 

By Lewis' Triviality Result (Chapter 1), there is no binary operation ◇ on a
Boolean ring that is compatible with conditional probabilities. That is, there is no binary
operation ◇ on R such that for all a, b ∈ R, and all probability measures P on R such
that P(b) ≠ 0, the equation

P(a ◇ b) = P(a|b) = P(ab)/P(b)

holds. Thus, to define conditional events compatible with probabilities, one is forced to go
outside R, and we enlarged R to R|R for that purpose. Now, having the conditional
space R|R, we wish to consider conditionals on it. But, again, R|R will not accommodate
conditionals between its elements that are compatible with probability. More precisely,
the situation is this.

Theorem 1. (The Triviality Result for R|R) Let R be a Boolean algebra with at least
sixteen elements. Then there does not exist a binary operation ◇ on R|R such that

P((a|b) ◇ (c|d)) = P((a|b)(c|d)) / P(c|d)

for all a, b, c, d ∈ R and for all probability measures P on R such that
P(b) ≠ 0 ≠ P(c|d).

Proof. If R is not atomic, then there exist four mutually disjoint non-zero elements
of R. Just take a ∈ R where a has no atom, and let a > b > c > d, all non-zero. Then
d, cd', bc', and ab' are four such elements of R. If R is atomic, let a be an atom of
R, b be an atom of a', c be an atom of (a V b)', and d be an atom of (a V b V c)'.
This is possible, else R has fewer than sixteen elements. Thus, in any case, R has four
mutually disjoint non-zero elements r, s, t, and u. Now, by the Stone Representation
Theorem, R is a subalgebra of 𝒫(Ω) for some set Ω, and so, viewing R in this way, let v, w, x,
and y be elements of r, s, t, and u respectively. Define a probability measure P on
𝒫(Ω) via

P(v) = 0.1, P(w) = P(x) = P(y) = 0.3.

Then, P is a probability measure on R. Let

a = r V s,
b = r V s V t,
c = r V t,
d = r V s V t V u.

A solution

(x|y) = (a|b) ◇ (c|d)

yields

P(x|y) = P((a|b)(c|d)) / P(c|d),

so that

P(x|y) = P(ac | a'b V c'd V bd) / P(c|d)
       = P(ac)P(d) / (P(a'b V c'd V bd)P(c))
       = (0.1)(1) / (P(t V (r V t)' V (r V s V t))(0.6))
       = 0.1 / ((0.7)(0.6)) = P(x)/P(y),

taking x ≤ y. But there is no such pair x, y ∈ R. □
There are some special cases for which solutions exist.

(i) For b = d = 1, we have (x|y) = (a|c) ∈ R|R.

(ii) More generally, for b = d, we have

P[(a|b)(c|b)]/P(c|b) = P(ac|b)/P(c|b) = P(abc)/P(bc) = P(a|bc),

so that a solution (x|y) is (a|bc).

(iii) Generalizing in a different direction, letting only d = 1, we have

P[(a|b)c]/P(c) = P(abc | b V c')/P(c) = P(ac)/P(c)

if c ≤ b, so that when c ≤ b, ((a|b)|c) = (a|c). In particular, ((a|b)|b) = (a|b).
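Special cases (ii) and (iii) can be checked on a small finite model. The sketch below is illustrative only: the four-point space, its weights, the particular events, and the helper names `prob`, `prob_cond` and `product` are assumptions introduced for the example. The product rule used is (a|b)(c|d) = (ac | a'b V c'd V bd).

```python
from fractions import Fraction

# Assumed toy model: four points with weights 1/10, 2/10, 3/10, 4/10.
WEIGHT = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}
OMEGA = frozenset(WEIGHT)


def prob(event):
    """P of an ordinary event (a subset of OMEGA)."""
    return sum((WEIGHT[w] for w in event), Fraction(0))


def prob_cond(a, b):
    """P(a|b) = P(ab)/P(b), assuming P(b) > 0."""
    return prob(a & b) / prob(b)


def product(a, b, c, d):
    """(a|b)(c|d) = (ac | a'b V c'd V bd), as (consequent, antecedent)."""
    return a & c, ((OMEGA - a) & b) | ((OMEGA - c) & d) | (b & d)


a, b, c = frozenset({0, 3}), frozenset({0, 1, 2}), frozenset({0, 1})

# Case (ii): common antecedent b.
case_ii = prob_cond(*product(a, b, c, b)) / prob_cond(c, b)

# Case (iii): d = 1 and c <= b.
case_iii = prob_cond(*product(a, b, c, OMEGA)) / prob(c)

print(case_ii, prob_cond(a, b & c))
print(case_iii, prob_cond(a, c))
```

Exact rational arithmetic is used so that the two sides of each identity can be compared with `==` rather than up to rounding.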

The interpretation of all the above is plausible from a rule deduction viewpoint
(see Dubois and Prade, 1990, and also Calabrese, 1987). For iterated conditionals of
conditionals with the same antecedent (that is, b = d), see also Pfanzagl (1971, p. 200). In
this case, for fixed b ∈ R, iterated conditionals of the form ((a|b)|(c|b)) are nothing
more than conditionals on the (quotient) Boolean ring R/Rb'. The operations on the ring
R/Rb' are

(a|b) + (c|b) = (a + c|b),
(a|b)·(c|b) = (ac|b),
(a|b)' = (a'|b).

Thus

((a|b)|(c|b)) = (a|b) + (R/Rb')(c|b)' ∈ (R/Rb')/(R/Rb')(c'|b).

We are going to show that ((a|b)|(c|b)) can be identified with (a|bc) ∈ R/R(bc)'. For
this purpose, consider the map

λ : R/Rb' → R/R(bc)'

defined by

λ(x + Rb') = x + R(bc)'.

First, this map is well-defined. Indeed, changing x to x + rb', the image under λ is
x + rb' + R(bc)'. But rb' ≤ b' V c' so that rb' ∈ R(bc)', that is, rb' + R(bc)' =
R(bc)'. It is obvious that λ is a ring homomorphism and is onto. It remains to verify
that the kernel of λ is precisely the principal ideal (R/Rb')(c'|b) of R/Rb'. We have

λ((x + Rb')(c' + Rb')) = λ(xc' + Rb') = xc' + R(bc)' = R(bc)'

(since xc' ≤ (bc)'), which is the zero in R/R(bc)'. Thus, (R/Rb')/(R/Rb')(c'|b) is
isomorphic to R/R(bc)'. □

However, the identification of ((a|b)|(c|b)) with (a|bc) does nothing toward
getting a general definition of conditionals for conditionals. Of course, one can argue
from some logical viewpoint, and then define, in an ad hoc or plausible manner, an
iterated conditional (a|b)|(c|d) in such a way that the above intuitive (and compatible)
special cases hold.

Our approach here is this. We cannot proceed in exactly the same manner as we did
to get R|R from R. The space R|R consists of all cosets of all principal ideals of R.
The space R|R is not even a ring, and so we cannot make a totally analogous
construction. However, in R, a coset a + Rb' = {a + rb' : r ∈ R} is the set of all
solutions x to the equation xb = ab. In R|R, we can carry out the construction
analogous to that. So we are led to the following definition.

Definition 1. For (a|b), (c|d) ∈ R|R, the iterated conditional ((a|b)|(c|d)) is the set

{(x|y) : (x|y)(c|d) = (a|b)(c|d)}.

The collection of these sets is denoted (R|R)|(R|R) and is called the space of iterated
conditionals.
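Definition 1 can be made concrete by brute force when R is the algebra of all subsets of a small set. The following sketch is an illustration under stated assumptions: subsets of a two-point set are encoded as bitmasks, a conditional (a|b) is stored as the pair (ab, b), and the names `cond`, `mul` and `iterated` are hypothetical helpers, not notation from the text.

```python
N = 2                      # assumed size of the underlying set
FULL = (1 << N) - 1        # bitmask of the unit element 1


def cond(a, b):
    """Canonical representative (ab, b) of the conditional (a|b)."""
    return (a & b, b)


def mul(p, q):
    """(a|b)(c|d) = (ac | a'b V c'd V bd)."""
    (a, b), (c, d) = p, q
    return cond(a & c, (~a & b) | (~c & d) | (b & d))


# All elements of R|R for this R (3**N of them).
ALL = sorted({cond(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})


def iterated(p, q):
    """All (x|y) with (x|y)(c|d) = (a|b)(c|d), where p = (a|b), q = (c|d)."""
    target = mul(p, q)
    return {r for r in ALL if mul(r, q) == target}


p = cond(0b01, 0b11)       # (a|b) with a = {point 0}, b = both points
q = cond(0b01, 0b01)       # (c|d) with c = d = {point 0}
solutions = iterated(p, q)
print(sorted(solutions))
```

The enumeration is exponential in N and is meant only as a way to inspect small cases by hand.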

Now ((a|b)|(c|d)) is not empty since it contains (a|b) as well as (a|b)(c|d).
In the case of ordinary events, the set {x : xb = ab} is the interval [ab, a V b']. That is,
solutions x to the equation xb = ab are exactly those x such that ab ≤ x ≤ a V b'. So
a conditional event is also an interval in R. This was discussed in Chapter 2. One might
expect that ((a|b)|(c|d)) is an interval in R|R under the partial order we defined by
(a|b) ≤ (c|d) if (a|b) = (a|b)(c|d). In fact, R|R is a pseudo-complemented lattice with
respect to this order, as expounded upon in Chapter 4. Now ((a|b)|(c|d)) does have a
smallest element, namely (a|b)(c|d). Furthermore, this is the counterpart to the smallest
element ab in the interval [ab, b' V a]. However, various counterparts to b' V a =
b → a (material implication), such as (a|b) V (c|d)' and Lukasiewicz's implication, are
not solutions to (x|y)(c|d) = (a|b)(c|d), that is, are not in ((a|b)|(c|d)). However, R|R
has a property that we have not yet exploited. It is relatively pseudo-complemented. It
turns out that b → a is a relative pseudo-complement in R of b with respect to a
since x ≤ b' V a if and only if xb ≤ a. (See the definition below.) So there is another
counterpart in R|R to b' V a, and it is that element that is a maximum solution to
(x|y)(c|d) = (a|b)(c|d) and guarantees that ((a|b)|(c|d)) is indeed an interval in R|R.

Definition 2. A lattice L is relatively pseudo-complemented if for every a, b ∈ L, there
is an element a*b ∈ L with the property that x ≤ a*b if and only if a ∧ x ≤ b.

Clearly there is only one such a*b, and it is called the pseudo-complement of a
relative to b. The element a*b satisfies a ∧ a*b ≤ b, and is the supremum of the set of
all such elements. The relative pseudo-complement of a with respect to 0 is called the
pseudo-complement of a, and that notion played an important role in Chapter 4.

The relevance of relative pseudo-complements to our problem is this. Suppose that
R|R is relatively pseudo-complemented. Then applying that property to the pair of
elements (c|d) and (a|b)(c|d), R|R has an element (e|f) = (c|d)*((a|b)(c|d)) such that
(e|f)(c|d) ≤ (a|b)(c|d) and such that (x|y) ≤ (e|f) if and only if (x|y)(c|d) ≤
(a|b)(c|d). But there are solutions to (x|y)(c|d) = (a|b)(c|d). Hence the
pseudo-complement (e|f) of (c|d) relative to (a|b)(c|d) satisfies (e|f)(c|d) =
(a|b)(c|d). Thus if

(x|y)(c|d) = (a|b)(c|d)

then

(a|b)(c|d) ≤ (x|y) ≤ (e|f).

Conversely, if

(a|b)(c|d) ≤ (x|y) ≤ (e|f)

then

(a|b)(c|d)(c|d) = (a|b)(c|d) ≤ (x|y)(c|d) ≤ (e|f)(c|d) = (a|b)(c|d),

and so

(x|y)(c|d) = (a|b)(c|d).

Therefore

((a|b)|(c|d)) = [(a|b)(c|d), (c|d)*((a|b)(c|d))].

Thus we need two things. We need that R|R is relatively pseudo-complemented, and we
need a formula for the relative pseudo-complement (c|d)*((a|b)(c|d)).

Theorem 2. R|R is relatively pseudo-complemented, and the pseudo-complement of
(a|b) relative to (c|d) is

(a|b)*(c|d) = (cd V a'b V b'd' | d V a'b V b'd').

Proof. Let e = cd V a'b V b'd', and f = d V a'b V b'd'. We need that (a|b)(x|y)
≤ (c|d) if and only if (x|y) ≤ (e|f). Now (a|b)(x|y) ≤ (c|d) if and only if

(ax | a'b V x'y V by) ≤ (c|d)

if and only if

ax(a'b V x'y V by) ≤ cd

and

c'd ≤ (ax)'(a'b V x'y V by)

if and only if

abxy ≤ cd

and

c'd ≤ (a' V x')(a'b V x'y V by) = a'b V x'y.

So we have that (a|b)(x|y) ≤ (c|d) if and only if

abxy ≤ cd and c'd ≤ a'b V x'y.

On the other hand, (x|y) ≤ (e|f) if and only if

xy ≤ ef = (cd V a'b V b'd')(d V a'b V b'd')
   = cd V a'b V b'd',

and

x'y ≥ e'f = (cd V a'b V b'd')'(d V a'b V b'd')
    = (c' V d')(a V b')(b V d)(d V a'b V b'd')
    = (c' V d')(a V b')(d V a'b)
    = (c' V d')(ad V b'd)
    = (ac'd V b'c'd)
    = c'd(a V b').

Thus we have (x|y) ≤ (e|f) if and only if

xy ≤ cd V a'b V b'd' and x'y ≥ c'd(a V b').

We need

xy ≤ cd V a'b V b'd' and x'y ≥ c'd(a V b')

if and only if

abxy ≤ cd and c'd ≤ a'b V x'y.

But x'y ≥ c'd(a V b') implies that

a'b V x'y ≥ a'b V c'd(a V b') ≥ c'd,

and c'd ≤ a'b V x'y implies that

c'd(a V b') ≤ (a'b V x'y)(a V b') = x'y(a V b') ≤ x'y.

From xy ≤ cd V a'b V b'd' we get

abxy ≤ ab(cd V a'b V b'd') = abcd.

Finally, abxy ≤ cd and c'd ≤ a'b V x'y imply

xy ≤ (ab)' V cd

and

c'd(a V b') ≤ (a'b V x'y)(a V b')
            = x'y(a V b') ≤ x'y ≤ x' V y',

from which we get

x' V y' ≥ ab(c' V d')

and

x' V y' ≥ c'd(a V b').

Thus

x' V y' ≥ ab(c' V d') V c'd(a V b').

But xy ≤ cd V a'b V b'd' is equivalent to

x' V y' ≥ (c' V d')(a V b')(b V d)
        = ab(c' V d') V c'd(a V b'). □
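The defining property of the relative pseudo-complement in Theorem 2 can be tested exhaustively on a small algebra. This is a sketch under assumptions, not a substitute for the proof: R is taken to be all subsets of a two-point set encoded as bitmasks, conditionals are stored as pairs (ab, b), and `cond`, `mul`, `leq` and `rpc` are hypothetical helper names.

```python
N = 2                      # assumed size of the underlying set
FULL = (1 << N) - 1


def cond(a, b):
    """Canonical representative (ab, b) of (a|b)."""
    return (a & b, b)


def mul(p, q):
    """(a|b)(c|d) = (ac | a'b V c'd V bd)."""
    (a, b), (c, d) = p, q
    return cond(a & c, (~a & b) | (~c & d) | (b & d))


def leq(p, q):
    """(a|b) <= (c|d) iff (a|b) = (a|b)(c|d)."""
    return mul(p, q) == p


def rpc(p, q):
    """Formula of Theorem 2: (cd V a'b V b'd' | d V a'b V b'd')."""
    (a, b), (c, d) = p, q
    extra = (~a & b) | (~b & ~d & FULL)   # a'b V b'd'
    return cond((c & d) | extra, d | extra)


ALL = sorted({cond(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})

# (a|b)(x|y) <= (c|d)  should hold exactly when  (x|y) <= (a|b)*(c|d).
violations = [
    (p, q, r)
    for p in ALL for q in ALL for r in ALL
    if leq(mul(p, r), q) != leq(r, rpc(p, q))
]
print(len(violations))
```

Because the operations act pointwise on the underlying set, a very small N already exercises every combination of truth values for the two conditionals.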
The relative pseudo-complement (a|b)*(c|d) can be written in an apparently
simpler form, namely

(a|b)*(c|d) = (cd V a'b V b'd' | a' V b' V d).

One should note the special case (c|d) = (0|1). We have

(a|b)*(0|1) = (a|b)* = (a'b|1),

the pseudo-complement in R|R of (a|b).

The relative pseudo-complement of (a|b) with respect to (c|d) is a form of an
"implication operator"

(a|b) ⇒ (c|d) = (a|b)*(c|d) = (cd V a'b V b'd' | a' V b' V d)

on R|R extending material implication on R. It can be viewed as the counterpart of
material implication in R|R. The truth table of (a|b) ⇒ (c|d) follows. Let x = (a|b)
and y = (c|d).


x ⇒ y        y = 0|1    y = 1|1    y = 0|0

x = 0|1        1|1        1|1        1|1
x = 1|1        0|1        1|1        0|0
x = 0|0        0|1        1|1        1|1

Corollary 1. ((a|b)|(c|d)) = [(a|b)(c|d), (c|d)*((a|b)(c|d))]

= [(a|b)(c|d), (abcd V c'd V ad' V b'd' | b V c'd V ad' V b'd')].

Proof. We need only to show that

(c|d)*((a|b)(c|d)) = (abcd V c'd V ad' V b'd' | b V c'd V ad' V b'd').

Let e = a'b V c'd V bd. By the formula in Theorem 2,

(c|d)*((a|b)(c|d)) = (c|d)*(ac | a'b V c'd V bd)
  = (ace V c'd V d'e' | e V c'd V d'e')
  = (abcd V c'd V ad' V b'd' | a'b V c'd V bd V ad' V b'd')
  = (abcd V c'd V ad' V b'd' | b V c'd V ad' V b'd'). □
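Corollary 1 says that the solution set of Definition 1 is exactly an interval with an explicit right-hand endpoint. A brute-force comparison on a small algebra is sketched below. It is an illustration only: the bitmask encoding of subsets of a two-point set and the names `cond`, `mul`, `leq`, `iterated` and `upper` are assumptions of the example.

```python
N = 2                      # assumed size of the underlying set
FULL = (1 << N) - 1


def cond(a, b):
    """Canonical representative (ab, b) of (a|b)."""
    return (a & b, b)


def mul(p, q):
    """(a|b)(c|d) = (ac | a'b V c'd V bd)."""
    (a, b), (c, d) = p, q
    return cond(a & c, (~a & b) | (~c & d) | (b & d))


def leq(p, q):
    """(a|b) <= (c|d) iff (a|b) = (a|b)(c|d)."""
    return mul(p, q) == p


ALL = sorted({cond(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})


def iterated(p, q):
    """Solution set {(x|y) : (x|y)(c|d) = (a|b)(c|d)} of Definition 1."""
    target = mul(p, q)
    return {r for r in ALL if mul(r, q) == target}


def upper(p, q):
    """Right-hand endpoint (abcd V c'd V ad' V b'd' | b V c'd V ad' V b'd')."""
    (a, b), (c, d) = p, q
    g = (~c & d) | (a & ~d) | (~b & ~d & FULL)   # c'd V ad' V b'd'
    return cond((a & c) | g, b | g)


mismatches = [
    (p, q)
    for p in ALL for q in ALL
    if iterated(p, q) != {r for r in ALL if leq(mul(p, q), r) and leq(r, upper(p, q))}
]
print(len(mismatches))
```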
The right hand endpoint

(c|d)*((a|b)(c|d)) = (abcd V c'd V ad' V b'd' | b V c'd V ad' V b'd')

can also be written in the somewhat simpler form

(ab V c'd V ad' V b'd' | b V c' V d').

Corollary 1 gives a way to identify R|R with a subset of (R|R)|(R|R). An easy
calculation shows that ((a|b)|(1|1)) = [(a|b), (a|b)]. Thus the map
(a|b) → ((a|b)|(1|1)) is one-to-one.

Corollary 2. ((a|b)|(c|d)) = (R|R)((c|d)*((a|b)(c|d))) V (a|b)(c|d).

Proof. An element (x|y) in the interval [(a|b)(c|d), (c|d)*((a|b)(c|d))] is

(x|y)((c|d)*((a|b)(c|d))) V (a|b)(c|d),

which is in

(R|R)((c|d)*((a|b)(c|d))) V (a|b)(c|d).

The converse is equally clear. □

Now ((a|b)|(c|d)) is an interval in R|R, and it would be nice to have simple
criteria for the equality ((a|b)|(c|d)) = ((e|f)|(g|h)). Two conditional events (a|b) and
(c|d) are equal if and only if ab = cd and b = d. The analogous condition here is that
(a|b)(c|d) = (e|f)(g|h) and (c|d) = (g|h). This does not seem to be the case, however,
and the best we can do at the moment is to say that (a|b)(c|d) = (e|f)(g|h), and
(c|d)*((a|b)(c|d)) = (g|h)*((e|f)(g|h)), that is, that the end points be the same. For
example, there does not seem to be a way to recover (c|d) from (a|b)(c|d) and
(c|d)*((a|b)(c|d)). This precludes making the definition

P((a|b)|(c|d)) = P((a|b)(c|d))/P(c|d)

since (c|d) is not available. However, in the conditional case,

P(a|b) = P(ab)/P(b) = P(ab)/(1 + P(ab) - P(a V b')).

This last expression affords a way to define P on (R|R)|(R|R), namely by the equation

P((a|b)|(c|d)) = P((a|b)(c|d))/(1 - P((a|b)(c|d)) + P((c|d)*((a|b)(c|d)))).

Furthermore,

P((a|b)|(1|1)) = P((a|b)(1|1))/(1 - P((a|b)(1|1)) + P((1|1)*((a|b)(1|1))))
              = P(a|b)/(1 - P(a|b) + P(a|b)) = P(a|b),

and this definition extends the definition of P on R|R, viewing R|R as embedded in
(R|R)|(R|R) by (a|b) → ((a|b)|(1|1)).

An element ((a|b)|(c|d)) of (R|R)|(R|R) contains some special elements besides
its endpoints (a|b)(c|d) and (c|d)*((a|b)(c|d)). It is a subset of R|R, and so consists
of a set of subsets of R. As the latter, its point set union can be taken, yielding a subset
of R. It is rather remarkable that doing so yields a coset, that is, an element of R|R, and
moreover that coset is in ((a|b)|(c|d)). We proceed now to verify all this.

Let (c|d)*((a|b)(c|d)) = (α|β). Since (a|b)(c|d) = (ac | a'b V c'd V bd), we have

α = abcd V γ,
β = (a'b V c'd V bd) V γ,

where

γ = (a'b V c'd V bd)'d' V c'd
  = (ab V b')d' V c'd.

We also have

((a|b)|(c|d)) = (R|R)γ V (a|b)(c|d).

Indeed,

(α|β) = (a|b)(c|d) V γ.

Thus,

((a|b)|(c|d)) = (R|R)((a|b)(c|d) V γ) V (a|b)(c|d)
             = (R|R)(a|b)(c|d) V (R|R)γ V (a|b)(c|d)
             = (R|R)γ V (a|b)(c|d).

The point of the equality ((a|b)|(c|d)) = (R|R)γ V (a|b)(c|d) is that there is a special
element γ ∈ R such that

(R|R)γ V (a|b)(c|d) = (R|R)(c|d)*((a|b)(c|d)) V (a|b)(c|d).

Of course we are identifying γ with (γ|1). Now, for any set of subsets S of R, let
∪(S) denote the union of all the sets in S. Noting that

∪((R|R)(x|1)) = ∪{(a|b)(x|1) : (a|b) ∈ R|R}
             = ∪{(a + Rb')(x + R0) : (a|b) ∈ R|R}
             = ∪{(a + Rb')x : (a|b) ∈ R|R}
             = Rx,

and that for two sets S and T, ∪(S V T) = ∪(S) V ∪(T), makes the proof of the following
theorem transparent.

Theorem 3. For a, b, c, d ∈ R,

∪[(a|b)|(c|d)] = (ab | b(a'd' V cd)) ∈ ((a|b)|(c|d)).

Proof. We have

∪[(a|b)|(c|d)] = ∪[(R|R)γ V (a|b)(c|d)]
              = ∪((R|R)γ) V (a|b)(c|d)
              = Rγ V (a|b)(c|d)
              = (0|γ') V (a|b)(c|d)
              = (abcd | abcd V γ'(a'b V c'd V bd))
              = (abcd | b(a'd' V cd))
              = (ab | b(a'd' V cd)).

To see that (ab | b(a'd' V cd)) ∈ ((a|b)|(c|d)), simply verify that

(ab | b(a'd' V cd))(c|d) = (a|b)(c|d). □
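Theorem 3 can be checked directly by taking each conditional as a coset (a set of elements of R) and forming the point-set union. The sketch below does this by enumeration. It is illustrative only: R is assumed to be all subsets of a two-point set encoded as bitmasks, and `cond`, `mul`, `iterated`, `coset` and `union_formula` are hypothetical names.

```python
N = 2                      # assumed size of the underlying set
FULL = (1 << N) - 1


def cond(a, b):
    """Canonical representative (ab, b) of (a|b)."""
    return (a & b, b)


def mul(p, q):
    """(a|b)(c|d) = (ac | a'b V c'd V bd)."""
    (a, b), (c, d) = p, q
    return cond(a & c, (~a & b) | (~c & d) | (b & d))


ALL = sorted({cond(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})


def iterated(p, q):
    """Solution set of Definition 1."""
    target = mul(p, q)
    return {r for r in ALL if mul(r, q) == target}


def coset(p):
    """(x|y) viewed as a subset of R: all z with zy = xy."""
    x, y = p
    return {z for z in range(FULL + 1) if z & y == x}


def union_formula(p, q):
    """Theorem 3: (ab | b(a'd' V cd))."""
    (a, b), (c, d) = p, q
    return cond(a, b & ((~a & ~d) | (c & d)))


bad_union = [
    (p, q) for p in ALL for q in ALL
    if set().union(*(coset(r) for r in iterated(p, q))) != coset(union_formula(p, q))
]
bad_member = [
    (p, q) for p in ALL for q in ALL
    if union_formula(p, q) not in iterated(p, q)
]
print(len(bad_union), len(bad_member))
```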

One may view ∪ as a binary operation on R|R, with

∪((a|b), (c|d)) = (ab | b(a'd' V cd)).

Now Calabrese (1987) has defined a binary "conditioning" operation on R|R which is his
candidate for iterated conditioning. His operation is given by

(a|b) ∘ (c|d) = (ab | b(d' V c)) = (ab | b(d → c)).

As a simple check shows, it is not true that

((a|b)(c|d)) ∘ (c|d) = (a|b) ∘ (c|d),

while it is the case that

(((a|b)(c|d))|(c|d)) = ((a|b)|(c|d)),

and hence that

∪(((a|b)(c|d))|(c|d)) = ∪((a|b)|(c|d)).

However, when (a|b) ≤ (c|d),

∪[(a|b)|(c|d)] = (ab | b(a'd' V cd))
              = (ab | b(d' V c)) = (ab | b(d → c)).

Therefore, the binary operation ∪ and Calabrese's "conditioning" operator on R|R
agree on pairs ((a|b), (c|d)) with (a|b) ≤ (c|d).
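The comparison between the two operations can be explored by enumeration. The following is a minimal sketch under assumptions: subsets of a two-point set are encoded as bitmasks, conditionals are stored as pairs (ab, b), and the names `cup` and `calabrese` are hypothetical labels for the two operations displayed above.

```python
N = 2                      # assumed size of the underlying set
FULL = (1 << N) - 1


def cond(a, b):
    """Canonical representative (ab, b) of (a|b)."""
    return (a & b, b)


def mul(p, q):
    """(a|b)(c|d) = (ac | a'b V c'd V bd)."""
    (a, b), (c, d) = p, q
    return cond(a & c, (~a & b) | (~c & d) | (b & d))


def leq(p, q):
    """(a|b) <= (c|d) iff (a|b) = (a|b)(c|d)."""
    return mul(p, q) == p


def cup(p, q):
    """The union operation: (ab | b(a'd' V cd))."""
    (a, b), (c, d) = p, q
    return cond(a, b & ((~a & ~d) | (c & d)))


def calabrese(p, q):
    """Calabrese's operation: (ab | b(d' V c))."""
    (a, b), (c, d) = p, q
    return cond(a, b & (~d | c))


ALL = sorted({cond(a, b) for a in range(FULL + 1) for b in range(FULL + 1)})

# The two operations agree whenever (a|b) <= (c|d).
agree_when_leq = all(
    cup(p, q) == calabrese(p, q) for p in ALL for q in ALL if leq(p, q)
)
# Pairs where replacing (a|b) by (a|b)(c|d) changes Calabrese's result.
counterexamples = [
    (p, q) for p in ALL for q in ALL if calabrese(mul(p, q), q) != calabrese(p, q)
]
# The same replacement never changes the union operation.
cup_invariant = all(cup(mul(p, q), q) == cup(p, q) for p in ALL for q in ALL)
print(agree_when_leq, len(counterexamples) > 0, cup_invariant)
```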

If b = d = 1, then ∪(a|c) = (a|c), so ∪ is onto. If b = d, then

∪[(a|b)|(c|b)] = (ab|bc) = (a|bc).

If d = 1 and b = c, then

∪[(a|b)|b] = (ab|b) = (a|b).

Thus ∪ produces "compatible" solutions, at least in the special cases considered at
the beginning of this section. It is obvious that ∪ preserves logical operations. Moreover,
the restriction of ∪ : (R|R)|(R|R) → R|R to {(a|b)|(c|d) : a, b ∈ R} is an isomorphism
for each pair c, d ∈ R. Also the restriction to {(a|b)|(c|b) : a, c ∈ R} is an
isomorphism for each b ∈ R. To prove these facts, only injectivity needs to be verified.
For the first, suppose

∪[(a1|b1)|(c|d)] = ∪[(a2|b2)|(c|d)].

By Theorem 3, we then have

(a1b1 | b1(a1'd' V cd)) = (a2b2 | b2(a2'd' V cd)),

that is,

a1b1cd = a2b2cd,
b1(a1'd' V cd) = b2(a2'd' V cd).        (1)

Since b1cd = a1b1cd V a1'b1cd, (1) is equivalent to

a1b1cd = a2b2cd,
a1'b1(d → c) = a2'b2(d → c).        (2)

Now

(a1|b1)(c|d) = (a1b1cd | a1'b1 V c'd V b1d)
            = (a1b1cd | a1'b1(d → c) V c'd V b1d),

since

a1'b1 V c'd V b1d = (a1'b1)(c'd)' V c'd V b1d
                  = a1'b1(d' V c) V c'd V b1d
                  = (a1'b1)(d → c) V c'd V b1d.

Also, observe that

[(a1'b1)(d → c) V c'd] V b1d
  = [(a1'b1)(d → c) V c'd] V b1d[(a1'b1)(d → c) V c'd]'
  = [a1'b1(d → c) V c'd] V a1b1cd,

with the last union being a disjoint one.

Thus, (2) implies

(a1|b1)(c|d) = (a2|b2)(c|d),

and hence

((a1|b1)|(c|d)) = ((a2|b2)|(c|d)).

To prove the second fact, suppose that

∪[(a1|b)|(c1|b)] = ∪[(a2|b)|(c2|b)],

that is,

(a1bc1 | bc1) = (a2bc2 | bc2),

or

a1bc1 = a2bc2,
bc1 = bc2.        (3)

Now, (a1|b)(c1|b) = (a1bc1 | b). Thus (3) implies that

(a1|b)(c1|b) = (a2|b)(c2|b).

Also, (3) implies that

(c1|b) = (c2|b),

and hence

((a1|b)|(c1|b)) = ((a2|b)|(c2|b)). □

8.2  Non-monotonic  logics  on  conditionals 

This section discusses non-monotonic entailment relations in conditional logic (CL).
In Chapter 6, the building block for a conditional probability logic (CPL) is the base space
R|R together with Lukasiewicz's three-valued logic. Conditional probabilities were
introduced into the analysis mainly for the purpose of reasoning under uncertainty. Of course,
other uncertainty measures could be used instead of probability (see for example,
Goodman, Nguyen and Rogers, 1990). This is basically a numerical approach to
reasoning with uncertainty in the sense that the uncertainty involved is taken into account
in a quantitative way. However, a qualitative approach to reasoning can be carried out at
the level of CL. In view of the structure of R|R, qualitative notions will be compatible
with quantitative ones. The need to manipulate conditionals qualitatively is apparent in
problems such as combination of rules in expert systems. Our concern here is to extract
some non-monotonic aspects of CL as well as to discuss the possibility for building
non-monotonic entailment relations on R|R.

In the case of classical two-valued logic (C2), truth is the only primitive notion. As
stated in Section 6.3, the logical entailment relation ⊢ in C2 is defined in terms of
models (homomorphisms ω from R to {0,1}, or equivalently, maximal filters of R).
In turn, ⊢ is expressed in terms of the order relation ≤ on R by b ⊢ a if and only if
b ≤ a. Now, since for c ∈ R, bc ≤ b, we see that if b ⊢ a then for c ∈ R, bc ⊢ a. This
property of ⊢ is referred to as "monotonicity," that is, roughly speaking, additional
evidence will not affect the validity of previous logical conclusions. In this sense, C2 is
called a monotonic deduction system, or the logic C2 is monotonic. In this case, the
monotonicity of ⊢ is due to the transitivity property of ≤. From an axiomatic approach
to entailment relations (for example, Gabbay, 1985), the monotonic ⊢ satisfies

(i) reflexivity: for a, b ∈ R, ab ⊢ a,

(ii) monotonicity: if b ⊢ a, then for c ∈ R, bc ⊢ a, and

(iii) transitivity (or cut): if ab ⊢ c and b ⊢ a, then b ⊢ c.
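The three properties can be confirmed by enumeration when entailment is read as the order relation on a small Boolean algebra. The sketch below is illustrative: it assumes R is the algebra of all subsets of a three-point set, encoded as bitmasks, and `entails` is a hypothetical helper name.

```python
from itertools import product

N = 3                        # assumed size of the underlying set
EVENTS = range(1 << N)       # every subset, encoded as a bitmask


def entails(b, a):
    """b |- a iff b <= a, that is, the meet of b and a equals b."""
    return b & a == b


# (i) reflexivity: ab |- a
reflexive = all(entails(a & b, a) for a, b in product(EVENTS, repeat=2))

# (ii) monotonicity: b |- a implies bc |- a
monotone = all(
    entails(b & c, a)
    for a, b, c in product(EVENTS, repeat=3)
    if entails(b, a)
)

# (iii) cut: ab |- c and b |- a imply b |- c
cut = all(
    entails(b, c)
    for a, b, c in product(EVENTS, repeat=3)
    if entails(a & b, c) and entails(b, a)
)
print(reflexive, monotone, cut)
```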

PL is also monotonic since probability is compatible with the order relation ≤ on
R. To capture common sense reasoning, some form of "non-monotonic" deduction is
desirable. Roughly speaking, an entailment relation ⊢ in some logic is nonmonotonic if,
in light of new evidence, previous logical conclusions may fail. Specifically, ⊢ is
nonmonotonic if the monotonicity property (ii) above does not hold. The
non-monotonicity of a logical system refers precisely to an entailment relation in it. Thus,
a logic can have both a monotonic entailment and a nonmonotonic one.

Examine again the ⊢ in C2. In applications, given a set of data {b1, ..., bn}, the
relation ⊢ is used to express the fact that some a follows logically from the data,
written

{b1, ..., bn} ⊢ a.

There are two procedures in this deduction process. First, combination of evidence is
taken as "conjunction" which is the ring multiplication. Second, ⊢ is defined as ≤. This
(partial) order relation on R is defined precisely in terms of ∧, via: for a, b ∈ R, b ≤ a if, by
definition, a ∧ b = b. Thus, in order to break the monotonicity of a system, one can either
consider combination of evidence differently or define ⊢ independently of ≤. We will
return to this issue shortly.

We proceed now to clarify the statement that "probabilistic reasoning captures a
form of non-monotonic reasoning." We know that PL is monotonic. What makes
"probabilistic reasoning" non-monotonic depends on the framework of inference. Suppose
we consider the (partial, quantitative) entailment of an event a from a collection of
events {b1, ..., bn} as a conditional probability P(a | b1 ∧ ... ∧ bn), denoted
{b1, ..., bn} ⊢ a with degree P(a | b1 ∧ ... ∧ bn). Since P(a | b ∧ c) may be smaller than
P(a | b), additional evidence can lower the degree of a previously drawn conclusion. In
other words, this partial entailment relation is non-monotonic. Note that the two primitive
notions involved here are truth and probability.
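A small numerical illustration of this effect follows. Everything in it is an assumption made for the example: the three-outcome space, its weights, and the event names are hypothetical, and `degree` simply computes the conditional probability used above as the degree of entailment.

```python
from fractions import Fraction

# Assumed toy distribution over three kinds of animals (hypothetical numbers).
WEIGHT = {
    "flying_bird": Fraction(9, 10),
    "penguin": Fraction(1, 20),
    "cat": Fraction(1, 20),
}

bird = frozenset({"flying_bird", "penguin"})   # evidence b
flies = frozenset({"flying_bird"})             # conclusion a
penguin = frozenset({"penguin"})               # additional evidence c


def prob(event):
    """P of an event (a set of outcomes)."""
    return sum((WEIGHT[w] for w in event), Fraction(0))


def degree(a, evidence):
    """Degree of entailment P(a | evidence), assuming P(evidence) > 0."""
    return prob(a & evidence) / prob(evidence)


before = degree(flies, bird)             # P(a | b)
after = degree(flies, bird & penguin)    # P(a | b and c)
print(before, after)
```

With these numbers the conclusion is strongly supported by b alone and not supported at all once c is added, which is the failure of property (ii) in quantitative form.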

It is possible to express the above aspect of non-monotonicity in a qualitative
fashion. Indeed, in the CL (Chapter 6), we have (a|b) ≤ (c|d) if and only if ab ≤ cd
and c'd ≤ a'b, and

(a|b) ⊢_CL (c|d)

is defined as

(a|b) ≤ (c|d).

Now, (a|b) and (a|bc) are not comparable in general, since we always have abc ≤ ab,
but not a'b ≤ a'bc, in general. On the other hand, the structure of R|R is such that
probabilities are compatible with operations on R|R, in particular P preserves the
(partial) order relation ≤ on R|R. Note also that, for the purpose of automation, a
syntactic representation of ⊢ is desirable.

Now from R (base space of C,), we go to R|R (base space of CL). The truth
space of R|R is {0, u, 1}. Note that, in the analysis of reasoning processes in AI,
three-valued logics often surface, for example, in Computation Theory (McCarthy, 1967),
in the semantics of non-monotonic entailment (Sandewall, 1989), and in modeling of
default rules (Dubois and Prade, 1990).

Since the truth-space of R|R is a three-element set, one can consider various logics
on R|R. In other words, the class of all possible logics for R|R is that of all
three-valued logics. Theorem 2 of Section 3.4 established their correspondences with
logical operators and relations on R|R, that is, at the syntax level. Depending upon
interpretations of conditional objects and intuitive logical aspects of problems at hand,
different choices of three-valued logics can be made. For example, for reasoning in the
theory of computable partial recursive functions, non-commutative three-valued logics
might be appropriate (for example, Guzman and Squier, 1990; see also Section 3.4). In a
direction related to quantum logic, non-distributive systems can be looked for (for
example, Schay, 1968).

As far as commutative three-valued logics are concerned, the standard literature is
summarized in Rescher (1969). As stated earlier, different choices of connectives for
three-valued logics (that is, truth tables) lead to different logical operators on R|R. Thus,
for example, Lukasiewicz's, Sobocinski's and Bochvar's logics correspond respectively to our
operators in Chapter 3, Adams-Calabrese-Schay's operators, and an alternative system of
Schay. (See Section 3.5; also Dubois and Prade, 1989, 1990.)

Consider first the case of Lukasiewicz's logic on R|R, corresponding to operators
∧, ∨ of Chapter 3. Suppose data consist of conditional information, or conditionals are
viewed as production rules in expert systems. A simple way to express the fact that the
conditional information (or rule) (e|f) follows logically from the data {(a|b), (c|d)} is to
define ⊢ as {(a|b), (c|d)} ⊢ (e|f) if and only if

(a|b) ∧ (c|d) ≤ (e|f).

This deduction process is exactly the same as in the case of R and hence is monotonic.
As suggested by Dubois and Prade (1989), one way to destroy the monotonicity of ⊢ is
to modify it at the combination of evidence level. Instead of using Lukasiewicz's
conjunction ∧, one might replace it by another one, for example, Sobocinski's. (See
Chapter 3.) The reason is this. Since ≤ on R|R is defined as

(a|b) ≤ (c|d) if and only if (a|b) ∧ (c|d) = (a|b),

as on R, the transitivity of ≤, coupled with this definition, is responsible for the
monotonicity of ⊢. If ∧ is replaced by Adams-Calabrese-Schay's conjunction ∧_0, then
⊢_0 is non-monotonic, where

{(a|b), (c|d)} ⊢_0 (e|f)

if, by definition,

(a|b) ∧_0 (c|d) ≤ (e|f);

and where

(a|b) ∧_0 (c|d) = ((b' ∨ a)(d' ∨ c) | b ∨ d).

Indeed, suppose (a|b) ≤ (e|f). By inspection, we see that

(a|b) ∧_0 (c|d) ≤ (a|b)

does not hold, so that, in general, (e|f) might not follow from {(a|b), (c|d)}.
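The contrast between the two conjunctions can be checked by brute force on a small Boolean algebra. The sketch below is only an illustration: the two-point sample space, the bitmask encoding of events, and the function names are our assumptions. A conditional (a|b) is stored as the pair (ab, b).

```python
from itertools import product

N = 2                    # size of a hypothetical sample space
FULL = (1 << N) - 1      # the whole space as a bitmask

def conditionals():
    """All conditional events, stored as (ab, b) with ab contained in b."""
    return [(p, b) for b in range(FULL + 1)
            for p in range(FULL + 1) if (p & ~b) == 0]

def leq(x, y):
    """(a|b) <= (c|d) iff ab <= cd and c'd <= a'b."""
    (ab, b), (cd, d) = x, y
    return (ab & ~cd) == 0 and ((d & ~cd) & ~(b & ~ab)) == 0

def luk(x, y):
    """Lukasiewicz conjunction: (a|b)(c|d) = (ac | a'b v c'd v bd)."""
    (ab, b), (cd, d) = x, y
    return (ab & cd, (b & ~ab) | (d & ~cd) | (b & d))

def quasi(x, y):
    """Adams-Calabrese-Schay conjunction: ((b' v a)(d' v c) | b v d)."""
    (ab, b), (cd, d) = x, y
    ant = b | d
    return (((FULL & ~b) | ab) & ((FULL & ~d) | cd) & ant, ant)

def monotone(conj):
    """True if x conj y <= x for all pairs, so added evidence never blocks a conclusion."""
    cs = conditionals()
    return all(leq(conj(x, y), x) for x, y in product(cs, cs))

luk_is_monotone = monotone(luk)        # expected to hold
quasi_is_monotone = monotone(quasi)    # expected to fail
```

One failing pair for `quasi` is x = (0|0) together with y = (1|1): their quasi-conjunction is (1|1), which is not below x.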

Note that, in view of Theorem 1, Section 3.3, the order relation ≤ on R|R can be
defined by (a|b) ≤ (c|d) if ab ≤ cd and c'd ≤ a'b, that is, by using only the ring
structure of R, without calling upon ∧. For other order relations on R|R, see the recent
work of Calabrese (1990).

Another way to modify ⊢ to obtain non-monotonicity is suggested by Sandewall
(1989). First, to define "partial interpretations," Sandewall considered Kleene's three-valued
base logic. By base logic, we mean truth tables of the three basic connectives "not," "and,"
and "or." This is the same as Lukasiewicz's three-valued base logic (Rescher, 1969, p.
34). The main difference between the two logics lies in the concept of implication. Thus,
in our setting, R|R is equipped with operators ', ∧, and ∨ of Chapter 3. The logical
entailment relation is next defined by introducing a preference order on the set of models
(partial interpretations). For details, see Sandewall (1989). This is in line with the general
methodology advocated by Hawthorne (1988) for building non-monotonic logics. To
achieve non-monotonicity, one should generalize the classical concept of models by taking
more primitive notions than just "truth." In Hawthorne's words, "there is more to the
meaning of a sentence than the determination of truth-values at possible worlds." One
should also take "entailment" as a primitive notion. That means an entailment relation
should be autonomous with respect to truth-value semantics. Then, as in the case of
"truth" as a primitive notion, once an entailment concept has been taken, one will specify
its "semantic rules" (in the same way that truth tables of logical connectives specify how
truth values of compound formulae are assigned) governing deduction processes. For an
axiomatic approach to non-monotonic entailment relations, see Gabbay (1985). Recent
relevant papers on non-monotonic logics include Grosof (1988), Bibel (1986), and McLeish
(1988).
8.3 Operations on cosets of regular rings

The algebraic structures more general than Boolean rings that are pertinent for our
considerations of conditional events, iterated conditionals, and so on, seem to be lattices of
some sort rather than more general rings. For example, in Chapter 4, we extended R,
which is both a Boolean ring and equivalently a Boolean lattice, to the space R|R of
conditional events. This space is a Stone algebra, which is a lattice more general than a
Boolean lattice. It is not a ring. That is, R|R generalizes R as a lattice, not as a ring.
There is the possibility, however, of generalizing this process of going from R to R|R
by starting with a ring more general than a Boolean one. Now R|R is the set of all
cosets of principal ideals of R, and the operations between its elements were defined to be
those induced by the operations on R. That is, if A and B are subsets of R, and * is
any binary operation on R, then, by definition, A*B = {a*b : a ∈ A, b ∈ B}. In the Boolean
ring case, addition and multiplication between cosets yielded cosets. In fact, for a, b ∈ R,
and ideals I and J of R,

(a + I) + (b + J) = (a + b) + (I + J),

and

(a + I)·(b + J) = ab + Ib + aJ + IJ.

These facts were thoroughly discussed in Chapter 3. These operations on R|R were the
basis of its development. While the fact that the set sum of cosets is a coset holds in any
ring and is easily verified, the fact that the set product of cosets is a coset is unexpected and
non-trivial. The question naturally arises as to the generality of this phenomenon. In
particular, for what rings does it hold? In this section, we will show that it holds for
commutative von Neumann regular rings. In Boolean rings, every element is an
idempotent, and these regular rings are good candidates for such an extension because of
the abundance of idempotents in them. Our principal result is Theorem 4, the extension of
Theorem 1 of Section 3.2 to these more general rings.


Definition 1. A commutative ring R is (von Neumann) regular if it has an identity, and if
for each x ∈ R, there is a y ∈ R such that xyx = x.


We  will  call  these  commutative  von  Neumann  regular  rings  simply  regular  rings.  Here 
are  some  examples  of  regular  rings: 

(1) Any Boolean ring is a regular ring.

(2) Any field is a regular ring.

(3) The Cartesian product of any family of regular rings is a regular ring. The ring
operations in such a product are, of course, componentwise.

(4) Quotients of regular rings are regular rings. That is, if R is a regular ring and
I is an ideal of R, then R/I is a regular ring.

(5) p-rings are regular rings. These are rings such that for some prime p and every
element x, px = 0 and x^p = x. Boolean rings are those for which p = 2.

For an element x in a regular ring, the element y such that xyx = x²y = x is not
unique since, for example, in a Boolean ring one may take y to be x or 1. However, the
element xy is unique. We denote it x°.

Lemma 1. Let R be a regular ring. Then for all x ∈ R,

(i) x° is unique,

(ii) x° is an idempotent, and

(iii) Rx = Rx°.

Proof. For (i), if (xy)x = (xz)x = x, then xy = xzxy = xz. For (ii), (xy)(xy) =
(xyx)y = xy. Finally, for (iii), clearly R(xy) ⊆ Rx. If a = rx ∈ Rx, then
a = (rx)(xy) ∈ R(xy), whence Rx° = Rx. □
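Lemma 1 can be checked by exhaustive search in a small concrete ring. The sketch below uses the integers modulo n, which we assume here as a convenient test case: Z/30 is a product of fields and hence regular, while Z/4 is not. The helper names `circ` and `lemma1_holds` are ours.

```python
def circ(x, n):
    """Return x° = xy mod n for some y with xyx = x (mod n), or None if no such y exists."""
    for y in range(n):
        if (x * y * x) % n == x % n:
            return (x * y) % n
    return None

def lemma1_holds(n):
    """Check parts (i)-(iii) of Lemma 1 for every element of Z/n."""
    for x in range(n):
        e = circ(x, n)
        if e is None:
            return False                      # Z/n is not regular
        # (i) every admissible y gives the same product xy
        if {(x * y) % n for y in range(n) if (x * y * x) % n == x} != {e}:
            return False
        # (ii) x° is idempotent
        if (e * e) % n != e:
            return False
        # (iii) the principal ideals Rx and Rx° coincide
        if {(r * x) % n for r in range(n)} != {(r * e) % n for r in range(n)}:
            return False
    return True

regular_30 = lemma1_holds(30)   # Z/30 = Z/2 x Z/3 x Z/5
regular_4 = lemma1_holds(4)     # fails: no y satisfies 2*y*2 = 2 (mod 4)
```

For instance, in Z/30 the element 4 has 4° = 16, the idempotent that is 0 modulo 2 and 1 modulo 3 and 5.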

Theorem 1. Let R be a regular ring. The following hold.

(i) For any principal ideal Ra of R, Ra = Re for a unique idempotent e.

(ii) I² = I for any ideal I of R.

(iii) Ra² = Ra for any a ∈ R.

(iv) Finitely generated ideals of R are principal.

(v) For ideals I and J of R, we have IJ = {ij : i ∈ I, j ∈ J} is an ideal.

Proof. To prove (i), Ra = Ra° with a° idempotent by Lemma 1. If Re = Rf,
with e, f idempotents, then e = rf and f = se for suitable elements r and s of R, and

e = rf = rse = rsef = f.

For (ii), clearly I² ⊆ I. If i ∈ I, then i = i°i ∈ I². Now (iii) follows since
Ra² = RaRa = Ra by (ii). To get (iv), we need that Ra_1 + Ra_2 + ... + Ra_n = Ra for
some a ∈ R. We may assume that each a_i is idempotent. Now,

Ra_1 + Ra_2 = R(a_1 + a_2 − a_1a_2)

since a_i(a_1 + a_2 − a_1a_2) = a_i for i = 1, 2, whence Ra_1 + Ra_2 ⊆ R(a_1 + a_2 − a_1a_2). The
other inclusion is easy.

Finally, to prove (v), IJ is closed under multiplication by any element a of R
since a(ij) = (ai)j with ai ∈ I, j ∈ J. We need i_1j_1 + ... + i_nj_n ∈ IJ for i_k ∈ I, j_k ∈ J,
k = 1, ..., n. From (iv), let Rj_1 + Rj_2 + ... + Rj_n = Rj. Then

i_1j_1 + ... + i_nj_n = (i_1r_1)j + ... + (i_nr_n)j

for suitable r_k, k = 1, ..., n. □

The  following  is  a  characterization  of  regular  rings  in  terms  of  products  of  cosets  of 
the  same  ideal. 


Theorem 2. Let R be a commutative ring with identity. Then R is regular if and only if
the set product of any two cosets of an ideal I is the product of those two cosets as
elements of the quotient ring R/I. That is, R is regular if and only if

(a + I)(b + I) = ab + I

for each ideal I of R, and a, b ∈ R.

Proof. If the equality above holds, then taking a = b = 0 yields I² = I for all
ideals I. Taking I = Rx gets RxRx = Rx = Rx², so that x = yx² for some y in R.
Thus R is regular. Now assume that R is regular. We need

(a + I)(b + I) = ab + I,

or that

{(a + i)(b + j) : i, j ∈ I} = {ab + ib + aj + ij : i, j ∈ I}
                            = {ab + k : k ∈ I}.

Clearly, (a + I)(b + I) ⊆ ab + I. We need to write ab + k in the form ab + ib + aj + ij.
Letting i = k°(1 − a) and j = k°(k − b + ab) accomplishes that. □

Note that Theorem 2 yields the ideal theoretic characterization of regular rings,
namely that a ring is regular if and only if I² = I for all ideals I.
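The characterization in Theorem 2 can be tested directly on small rings. The following sketch, again using the integers modulo n as an assumed test bed and with helper names of our own choosing, compares the set product of two cosets of the same ideal with the quotient-ring product.

```python
def ideals(n):
    """All ideals of Z/n; each is generated by a divisor d of n."""
    return [frozenset((d * k) % n for k in range(n))
            for d in range(1, n + 1) if n % d == 0]

def coset(a, ideal, n):
    """The coset a + I as a set of residues."""
    return frozenset((a + i) % n for i in ideal)

def set_product(A, B, n):
    """The set product AB = {xy : x in A, y in B}."""
    return frozenset((x * y) % n for x in A for y in B)

def theorem2_holds(n):
    """True if (a + I)(b + I) = ab + I for every ideal I and all a, b in Z/n."""
    return all(
        set_product(coset(a, I, n), coset(b, I, n), n) == coset(a * b, I, n)
        for I in ideals(n) for a in range(n) for b in range(n)
    )

holds_in_6 = theorem2_holds(6)   # Z/6 is regular
holds_in_4 = theorem2_holds(4)   # Z/4 is not: I = {0, 2} has I*I = {0}
```

The failure in Z/4 occurs already at a = b = 0, where the set product of I = {0, 2} with itself is {0} rather than I, in line with the criterion I² = I.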

We now turn to the problem of showing that the set product of two cosets of ideals
of a regular ring is again a coset. Specifically, we will show that

(a + I)(b + J) = ab + aJ + bI + IJ.

To do this, we investigate the quantity

K(a, b, I, J) = {aj + bi + ij : i ∈ I, j ∈ J}.

We have

(a + I)(b + J) = ab + aJ + bI + IJ = ab + K(a, b, I, J).

In general, K(a, b, I, J) is not an ideal of R. However, if we let

K_0(a, b, I, J) = σ(IJ) + aJ + bI

where σ(IJ) denotes the ideal generated by IJ, then K_0(a, b, I, J) is always an ideal,
and, moreover, we have:


Lemma 2. Let R be a commutative ring with unit 1. Then for a, b ∈ R, and I, J ideals
of R,

aJ ∪ bI ⊆ K(a, b, I, J) ⊆ K_0(a, b, I, J) = σ(IJ) + K(a, b, I, J).

Proof. Since 0 is in any ideal, it follows that

aJ ∪ bI ⊆ K(a, b, I, J).

Next, if i ∈ I and j ∈ J, then ij ∈ σ(IJ), hence

K(a, b, I, J) ⊆ K_0(a, b, I, J).

Clearly

σ(IJ) + K(a, b, I, J) ⊆ K_0(a, b, I, J).

Conversely, let k ∈ σ(IJ). We have

aj + bi + k = (k − ij) + (aj + bi + ij) ∈ σ(IJ) + K(a, b, I, J). □


Lemma 3. Let R be a commutative ring with identity. The following are equivalent.

(i) For a, b ∈ R, and I, J ideals of R, K(a, b, I, J) is an ideal.

(ii) For a, b ∈ R, and I, J ideals of R, K(a, b, I, J) = K_0(a, b, I, J).

(iii) σ(IJ) ⊆ K(a, b, I, J), for a, b ∈ R, and I, J ideals of R.

Proof. That (ii) implies (i) is obvious. Assume (i). Then for i ∈ I, j ∈ J, we have

ij = (ij + ja + bi) − (ja + bi) ∈ K(a, b, I, J),

since

aJ ⊆ K(a, b, I, J), bI ⊆ K(a, b, I, J).

Thus (i) implies (iii). Assume (iii). In view of Lemma 2, it suffices to show that

K_0(a, b, I, J) ⊆ K(a, b, I, J).

For this purpose, let aj + bi + k ∈ K_0(a, b, I, J). Then

aj + bi + k = (aj + bi + ij) + (k − ij)

with k − ij ∈ σ(IJ). Now, by hypothesis, (iii) holds for any a, b in R. Thus taking
c = a + i, d = b + j, we have σ(IJ) ⊆ K(c, d, I, J). That is, k − ij is of the form

(a + i)j_1 + (b + j)i_1 + i_1j_1

for some i_1 ∈ I, j_1 ∈ J. Hence

aj + bi + k = aj + bi + ij + (a + i)j_1 + (b + j)i_1 + i_1j_1

            = a(j + j_1) + b(i + i_1) + ij + ij_1 + ji_1 + i_1j_1

            = a(j + j_1) + b(i + i_1) + (i + i_1)(j + j_1) ∈ K(a, b, I, J). □

Theorem 3. Let R be a regular ring. Then for a, b ∈ R, and ideals I, J of R,

(a + I)(b + J) = ab + aJ + bI + IJ ∈ 𝒮(R).

Proof. By Theorem 1, σ(IJ) = IJ. But IJ = I ∩ J. Clearly, IJ ⊆ I ∩ J. Conversely,
if a ∈ I ∩ J, then since R is regular,

a = a°a ∈ IJ.

Thus if r ∈ I ∩ J,

i = r°(1 − a) ∈ I,

and

j = r°(r − b + ab) ∈ J,

so that

r = aj + bi + ij ∈ K(a, b, I, J),

that is,

σ(IJ) ⊆ K(a, b, I, J).

In view of Lemma 3, we then have

K(a, b, I, J) = K_0(a, b, I, J) = aJ + bI + IJ. □
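Theorem 3 lends itself to the same kind of exhaustive check. The sketch below verifies the product formula, together with the identity IJ = I ∩ J used in the proof, on the regular rings Z/6 and Z/10. The choice of rings and the helper names are our assumptions, made only for illustration.

```python
def ideals(n):
    """All ideals of Z/n; each is generated by a divisor d of n."""
    return [frozenset((d * k) % n for k in range(n))
            for d in range(1, n + 1) if n % d == 0]

def coset(a, ideal, n):
    return frozenset((a + i) % n for i in ideal)

def set_product(A, B, n):
    return frozenset((x * y) % n for x in A for y in B)

def theorem3_holds(n):
    """Check (a + I)(b + J) = ab + aJ + bI + IJ and IJ = I n J over Z/n."""
    for I in ideals(n):
        for J in ideals(n):
            IJ = set_product(I, J, n)
            if IJ != I & J:
                return False
            for a in range(n):
                for b in range(n):
                    lhs = set_product(coset(a, I, n), coset(b, J, n), n)
                    # Set sum with independent choices of j, i and k.
                    rhs = frozenset((a * b + a * j + b * i + k) % n
                                    for i in I for j in J for k in IJ)
                    if lhs != rhs:
                        return False
    return True

holds_in_6 = theorem3_holds(6)
holds_in_10 = theorem3_holds(10)
```

Note that the right-hand side is formed as a set sum, so j, i and the element of IJ are chosen independently; the content of the theorem is that this larger-looking set is no bigger than the set product.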

We have just seen that if R is a regular ring, then set-extension operations of
addition and multiplication are operators on the space 𝒮(R) of all cosets of R, extending
coset operations on each fixed quotient ring. Of course, by Theorem 2, this property is
unique to regular rings. However, it is not known which rings have the property that
products of cosets are cosets, or indeed if having this property is unique to commutative
regular rings.

To extend Theorem 1 of Section 3.2 to regular rings, we define analogs of ' and ∨
for regular rings. For a, b ∈ R, let

a' = 1 − a,

and

a ∨ b = a + b − ab.

These operations are extended to subsets of R as usual. For A, B ⊆ R,

A' = {1 − a : a ∈ A},

A ∨ B = {a + b − ab : a ∈ A, b ∈ B}.

One should note that A ∨ B is not

A + B − AB = {a + b − cd : a, c ∈ A, b, d ∈ B}.

However, DeMorgan laws do hold.

(AB)' = A' ∨ B',

(A ∨ B)' = A'B'.


The following theorem is a generalization of Theorem 1 in Section 3.2.

Theorem 4. Let R be a regular ring. Then for a, b ∈ R, and ideals I, J of R,

(i) (a + I) + (b + J) = (a + b) + (I + J),

(ii) (a + I)·(b + J) = ab + aJ + bI + IJ,

(iii) (a + I)' = a' + I,

(iv) (a + I) ∨ (b + J) = a ∨ b + (a'J + b'I + IJ).

Proof. (i) and (ii) have been proved previously; (iii) is easy. For (iv), we use one
of the DeMorgan laws above and (ii). We have

(a + I) ∨ (b + J) = ((a' + I)(b' + J))'
                  = 1 − (a'b' + a'J + b'I + IJ)
                  = a ∨ b + a'J + b'I + IJ. □

The difficult part of Theorem 4 is (ii). It was proved by inspecting the quantity
K(a, b, I, J). There is a more direct proof, which goes as follows. First, assume that I
and J are principal ideals. Let I = Re, J = Rf with e, f idempotents. It suffices to
solve the equation

ij + ib + ja = i_1j_1 + i_2b + j_2a

for i ∈ I, j ∈ J, where i_1, i_2 ∈ I; j_1, j_2 ∈ J. Letting

i = (x − a)ef + i_2(1 − f)e,
j = (1 − b)ef + j_2(1 − e)f

where x = i_1j_1 + i_2b + j_2a + ab, yields a solution.

For the general case, by Theorem 1, the ideal Ri + Ri_1 + Ri_2 is a principal ideal
Re, and Rj + Rj_1 + Rj_2 is Rf with e, f idempotents. Thus, the principal ideal case
finishes the proof. □
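Parts (iii) and (iv) of Theorem 4, along with one of the DeMorgan laws for subsets, can be confirmed numerically. The sketch below does so in Z/6; the ring and the helper names `comp` and `join` are assumptions made for this illustration.

```python
def ideals(n):
    """All ideals of Z/n; each is generated by a divisor d of n."""
    return [frozenset((d * k) % n for k in range(n))
            for d in range(1, n + 1) if n % d == 0]

def coset(a, ideal, n):
    return frozenset((a + i) % n for i in ideal)

def set_product(A, B, n):
    return frozenset((x * y) % n for x in A for y in B)

def comp(A, n):
    """A' = {1 - a : a in A}."""
    return frozenset((1 - a) % n for a in A)

def join(A, B, n):
    """A v B = {a + b - ab : a in A, b in B}."""
    return frozenset((a + b - a * b) % n for a in A for b in B)

def theorem4_holds(n):
    for I in ideals(n):
        for J in ideals(n):
            IJ = set_product(I, J, n)
            for a in range(n):
                for b in range(n):
                    A, B = coset(a, I, n), coset(b, J, n)
                    # (iii) (a + I)' = a' + I
                    if comp(A, n) != coset(1 - a, I, n):
                        return False
                    # (iv) (a + I) v (b + J) = a v b + (a'J + b'I + IJ)
                    rhs = frozenset(
                        (a + b - a * b + (1 - a) * j + (1 - b) * i + k) % n
                        for i in I for j in J for k in IJ)
                    if join(A, B, n) != rhs:
                        return False
                    # DeMorgan: (AB)' = A' v B'
                    if comp(set_product(A, B, n), n) != join(comp(A, n), comp(B, n), n):
                        return False
    return True

holds_in_6 = theorem4_holds(6)
```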

There are other analogs for ' and ∨ on a regular ring than the ones we defined
above. An alternative is this. In analogy with the Boolean case, define, for a, b ∈ R,

(a|b) = {x ∈ R : xb = ab}.

Then, assuming throughout that R is regular,

(a|b) = a + R(1 − b°).

Indeed, first observe that

a + R(1 − b°) = ab° + R(1 − b°).

This can be seen as follows. If x ∈ a + R(1 − b°), then

x = a + r(1 − b°)

  = a(1 − b° + b°) + r(1 − b°)

  = ab° + (a + r)(1 − b°)

which is in ab° + R(1 − b°). Conversely, for x = ab° + s(1 − b°),

x = a(1 − 1 + b°) + s(1 − b°)

  = a + (s − a)(1 − b°)

which is in a + R(1 − b°).

Now let x ∈ (a|b), that is, xb = ab. Multiplying through by an element y with
byb = b, so that by = b°, yields xb° = ab°.

Thus

x = x(1 − b° + b°)

  = x(1 − b°) + xb°

  = x(1 − b°) + ab°,

which is in

ab° + R(1 − b°) = a + R(1 − b°).

Conversely, if x = ab° + r(1 − b°) for some r ∈ R, then

xb = ab°b + r(1 − b°)b = ab. □
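The identity (a|b) = a + R(1 − b°) can be confirmed by enumeration. The sketch below does this in Z/30 and also records an instance, in the field Z/5, where the coset a + R(1 − b) built without the idempotent part would be wrong. The rings and the helper names are assumptions for illustration.

```python
def circ(x, n):
    """x° = xy mod n for some y with xyx = x (mod n); assumes Z/n is regular."""
    for y in range(n):
        if (x * y * x) % n == x % n:
            return (x * y) % n
    raise ValueError("no quasi-inverse: ring is not regular")

def conditional(a, b, n):
    """(a|b) = {x : xb = ab}, computed straight from the definition."""
    return {x for x in range(n) if (x * b) % n == (a * b) % n}

def coset_form(a, b, n):
    """a + R(1 - b°)."""
    e = circ(b, n)
    return {(a + r * (1 - e)) % n for r in range(n)}

def identity_holds(n):
    return all(conditional(a, b, n) == coset_form(a, b, n)
               for a in range(n) for b in range(n))

holds_in_30 = identity_holds(30)

# In Z/5 with a = 0, b = 2: (a|b) = {0}, but 0 + R(1 - b) is the whole ring.
naive_coset = {(0 + r * (1 - 2)) % 5 for r in range(5)}
```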


The fact that {x ∈ R : xb = ab} = a + R(1 − b°) rather than a + R(1 − b) suggests
that one might want to define ' on regular rings by a' = 1 − a° rather than 1 − a. In
that case, in order for DeMorgan's laws to hold, and in analogy with the Boolean case, one
should define ∨ by

a ∨ b = (a' ∧ b')' = (a'b')'

      = ((1 − a°)(1 − b°))'

      = 1 − (1 − a° − b° + a°b°)°

      = a° + b° − a°b°.

With respect to these operations, a regular ring R satisfies the following properties.

(1) a(b ∨ c) ≠ ab ∨ ac in general, and ab ∨ c = (a ∨ c)(b ∨ c).

(2) (a ∨ b)' = a'b' and (ab)' = (a' ∨ b').

(3) a(a ∨ b) = a and a ∨ ab = a°.

The verification of these properties is completely routine. The upshot of property (1) is
that ∨ distributes over products, but not the other way around. Property (2) asserts that
DeMorgan's laws hold. Property (3) is one absorption law, and the failure of the other.
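The routine verification mentioned above can be delegated to a short exhaustive search. The sketch below works in Z/30 with the alternate operations a' = 1 − a° and a ∨ b = a° + b° − a°b°; the ring and the function names are our assumptions.

```python
from itertools import product

N = 30  # Z/30 is a regular ring (a product of three fields)

# Table of x° for every x in Z/N.
CIRC = [next((x * y) % N for y in range(N) if (x * y * x) % N == x)
        for x in range(N)]

def comp(a):
    """Alternate complement a' = 1 - a°."""
    return (1 - CIRC[a % N]) % N

def join(a, b):
    """Alternate join a v b = a° + b° - a°b°."""
    ea, eb = CIRC[a % N], CIRC[b % N]
    return (ea + eb - ea * eb) % N

def properties_hold():
    for a, b, c in product(range(N), repeat=3):
        # v distributes over products
        if join(a * b, c) != (join(a, c) * join(b, c)) % N:
            return False
        # DeMorgan's laws
        if comp(join(a, b)) != (comp(a) * comp(b)) % N:
            return False
        if comp(a * b) != join(comp(a), comp(b)):
            return False
        # one absorption law, and the modified form of the other
        if (a * join(a, b)) % N != a or join(a, a * b) != CIRC[a]:
            return False
    return True

all_hold = properties_hold()

# Products do not distribute over v: a = 7, b = c = 1 gives 7 on the left, 1 on the right.
left, right = (7 * join(1, 1)) % N, join(7 * 1, 7 * 1)
```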

There does not seem to be a way to define a partial order ≤ on R in terms of
these operations so that R is a lattice. In fact, defining ≤ by a ≤ b if a = ab, or if
a = ab°, does not yield a partial order. Anti-symmetry is not achieved. For example, for
the case a ≤ b if and only if a = ab°, if a ≤ b and b ≤ a, then a° = b°, but a ≠ b in general,
unless a and b are idempotents. Thus, this alternate definition of ', and consequently of
∨, on R, utilizing more heavily the idempotent part of R, does not result in a particularly
tractable algebraic system on which to base a logic.
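A concrete instance of the failure of anti-symmetry is easy to exhibit. In the sketch below we take the field Z/5, an assumed example in which every nonzero element x has x° = 1, so any two nonzero elements are related in both directions.

```python
def circ(x, n):
    """x° = xy mod n for some y with xyx = x (mod n); assumes Z/n is regular."""
    for y in range(n):
        if (x * y * x) % n == x % n:
            return (x * y) % n
    raise ValueError("no quasi-inverse: ring is not regular")

def leq(a, b, n):
    """Candidate relation: a <= b iff a = a b°."""
    return a % n == (a * circ(b, n)) % n

n = 5
one_below_two = leq(1, 2, n)   # 1 = 1 * 2° since 2° = 1
two_below_one = leq(2, 1, n)   # 2 = 2 * 1° since 1° = 1
```

Both relations hold although 1 ≠ 2, so the relation is not a partial order on Z/5.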

It is instructive to see what Theorem 4 becomes with these alternate definitions of
' and ∨. Of course parts (i) and (ii) do not change. Some properties of these new
operations when extended to cosets follow. Properties (5) and (6) are the analogs of parts
(iii) and (iv) of Theorem 4.

(4) (a + Rb)° = R°b° + a°b'.

(5) (a + Rb)' = R°b° + a'b'.

(6) (a + Rb) ∨ (c + Rd) = R°(bd ∨ a'd ∨ bc') + (ab' ∨ cd').

To give a better appreciation of the analogs, we present a proof of (5). If x is an
element of a regular ring R, then there is an element y such that xyx = x. Denote such
an element by x†. Thus x†x = x°. Now note the following equalities.

a + b = a + bb° = ab°' + (a + b)b°,

(a†b°' + (a + b)†b°)(ab°' + (a + b)b°) = a°b°' + (a + b)°b°,

and

(a°b°' + (a + b)°b°)(ab°' + (a + b)b°) = a + b.

Thus

(a + b)° = a°b°' + (a + b)°b°.

s = −ab† + (1 − e)b†

and

y = a†(1 − b°) + b°(1 − e)

does the trick.


For  further  work  on  developing  algebraic  properties  for  conditionals  on  regular 
rings,  see  Goodman  and  Nguyen  (1990). 

8.4  Miscellaneous  issues  and  open  problems 

It  is  time  to  summarize  our  work  and  to  discuss  open  problems. 

The  topic  of  conditioning  is  perhaps  very  old  since  it  is  central  to  empirical  sciences. 
However,  the  concept  of  "measure-free"  conditioning  has  not  been  studied  seriously  due  to 
a  lack  of  motivation.  It  is  the  fundamental  aspect  of  probabilistic  inference  in  expert 
systems  that  motivated  us  to  look  again  at  this  topic  and  to  formulate  a  rigorous  theory  of 
conditioning. 

The  subject  of  probabilistic  inference  in  expert  systems  has  attracted  considerable 
attention  among  researchers  in  artificial  intelligence  and  has  caused  much  discussion. 
Several  fundamental  ideas  and  methodologies  relating  to  conditioning  have  been  proposed, 
most  of  which  were  highly  appealing  on  common  sense  grounds.  However,  serious 
foundational  problems  have  been  encountered,  as  has  been  the  case  in  many  other  areas  of 
science.  Accordingly,  clarification  of  conditioning  at  the  basic  level  is  necessary.  The 
purpose  of  this  monograph  is  to  introduce  a  rigorous  theory  of  measure-free  conditioning 
which  can  be  utilized  in  inference  procedures  in  intelligent  machines.  The  theory 
developed  here  concerns  mainly  basic  mathematical  objects  such  as  ordinary  sets  and 
probability  measures.  It  can  be  regarded  as  a  first  step  that  will  lead  to  extensions  in 
various  directions  of  interest. 

Basically, this work is an effort to provide a better understanding of the logics of
conditionals. It is an attempt to bring conditional logic closer to the level of
understanding of classical logic. Such an understanding is needed since more and
more AI techniques rely on formal methods in logic to guide programming in intelligent
machines. Logics can be viewed as knowledge representation languages in which facts,
rules, and inference can be stated and manipulated.

Uncertainty  modeling  is  a  tricky  business  in  AI.  Unlike  the  term  "conditionals"  used 
in  classical  two-valued  logic,  where  "conditional"  is  referred  to  as  material  implication, 
conditionals  or  implicative  statements  used  in  this  text  need  to  be  modeled  properly  in  the 
context of reasoning with uncertainty. A "measure-free" approach seems to be the most
objective way to lay the first brick. However, we are all biased by the popular approach
to  uncertainty  modeling,  namely,  probability  theory,  in  which  there  is  a  fundamental 
concept  of  probabilistic  conditioning.  We  have  tried  to  revise  the  work  of  others 
concerning  conditioning  concepts  to  be  compatible  with  probability  theory.  We  provided 
an  axiomatic  approach  to  conditionals,  and  built  a  conditional  logic.  This  should  stimulate 
further  work  to  improve  it  and  to  extend  it  towards  applications.  In  a  time  of  fast 
advances  in  AI  technologies,  we  hope  that  it  is  useful  to  have  a  monograph  on  the  subject, 
even  at  a  tentative  level.  Many  issues  remain  to  be  re-examined  and  much  further  work  is 
needed.  We  now  discuss  some  of  these  issues  and  some  open  problems. 

A.  Conditionals  on  more  general  algebraic  systems 

The axiomatic approach in Chapter 2 led to the coset form for conditionals on
Boolean rings. This mathematical representation of implicative statements is satisfactory
in the sense that it reflects earlier thoughts on the concept of conditioning in logic, and
coincides with that derived from other work on the subject. There are a number of elegant
characterizations of conditional probability without any reference to conditional events,
such as Aczel's generalization of Renyi's axioms (Aczel, 1966) or Cox's approach (Cox,
1961). However, DeFinetti (1974) and, more generally, Lindley (1982), characterized
conditional probability via the "Dutch book," or equivalently, uncertainty decision game.
This does use (tacitly) DeFinetti's conditional event indicator function (see also Goodman
et al. (1990) for a modification of certain of Lindley's conclusions concerning the
inadmissibility of uncertainty measures). In connection with these results, it is of some
interest to attempt to relate all of these characterizations with the standard probability
evaluation we use, namely P((a|b)) = P(a|b).

The next problem has been: once the concrete conditional space R|R is obtained,
what are the logical operators on it? From a "syntax" viewpoint, this is an extension
problem. The operations on the Boolean ring R need to be extended to operators on
R|R which capture, in some reasonable sense, aspects of combination of evidence in
rule-based systems. In Chapter 3, the approach is algebraic. It is motivated by an
interesting problem in ring theory, namely, how to extend appropriately coset operations
on each quotient ring of R to R|R? It turns out that set-extension operations provide a
natural solution to this extension problem. In this way, R|R becomes a Stone algebra
(Chapter 4).

All  that  was  done  for  Boolean  rings,  for  mathematical  interest  as  well  as  for 
applications.  Conditionals  on  more  general  algebraic  structures  now  need  to  be 
investigated.  In  Chapters  7  and  8,  we  have  touched  upon  two  generalizations:  fuzzy  sets 
and  regular  rings. 
In carrying out the construction of R|R from R for R a commutative regular ring
rather than a Boolean ring, problems arise. We can certainly let R|R be the set of all
cosets of principal ideals of R. But just how they should be manipulated in order to provide
suitable generalizations of the Boolean case is not settled. It is not totally clear what a
conditional event should be in this context. Should (a|b) be the coset a + R(1 − b) or
the coset a + R(1 − b°)? In either case, since products and sums of cosets are cosets, they
can be added and multiplied, but there are choices to be made for ' and ∨. As
mentioned earlier, it seems not to be known which rings have the property that products of
cosets are cosets, and of course we are only considering the commutative case.

B.  Three-valued  logics  of  conditionals 

Various open issues have been suggested by Schay (1968, p. 343-344) as far as
R|R, viewed as the space of generalized (three-valued) indicator functions, is concerned.
Viewing the conditional space R|R as some specific algebraic structure, for example, as
a Stone algebra, "probability-like" measures on it should be formulated on a more thorough
measure theoretical basis. This is somewhat similar to the situation in quantum
probability (see, for example, Gudder, 1988) in which the domain of a generalized measure
is an algebraic structure slightly more general than the usual concept of σ-algebra, namely
a σ-additive class.

On the other hand, one might ask what would R|R be, as an algebraic structure, if
instead of using Lukasiewicz's three-valued logic (corresponding to logical operations on
R|R as developed in Chapter 3), one started with either Schay's first or second system, or
with Sobocinski's or Bochvar's three-valued logic?

In  Chapter  3  we  established  the  connection  between  logical  operations  on  the 
conditional space R|R and truth tables in three-valued logic. It might be interesting to
explore the situation in n-valued logics (n > 3). Logical operators on R|R, as developed
in Chapter 3, lead to a well-known three-valued logic, namely that of Lukasiewicz. The
algebraic structure of R|R so obtained is a modification of Koopman's non-totally
comparable conditional qualitative probability structure (Koopman, 1940, 1964).
Referring to the excellent analysis and summary by Fine (1973, pp. 183-196), the order
relation on R|R, as defined in Chapter 3, can be seen to satisfy essentially all but two of
Koopman's axioms. Additional work should be carried out on this aspect of conditional
event algebra, and should focus on the basic equivalence (not just implication) between
the partial order on R|R and the numerical partial order on corresponding conditional
probabilities. By proving that there is a bijection between the class of all three-valued
logics and logical operators on R|R, the search for operators on R|R might begin by
examining the class of all possible three-valued logics. For example, it turns out that
one of Schay's systems of logical operators on R|R corresponds to Bochvar's three-valued logic,
while Schay's other system (as well as Adams' and Calabrese's) corresponds to
Sobocinski's. This should be viewed as a healthy situation in reasoning with uncertain
conditional  data,  rather  than  a  divergence  of  opinion.  This  is  similar  to  the  debate  about 
choices  of  uncertainty  measures  in  AI,  and  to  the  choice  of  logical  systems  in  fuzzy  logic. 
For  the  latter,  each  logical  system  in  fuzzy  logic  is  modeled  by  a  triple  (N,  T,  S),  where 
N is a negation operator, T is a t-norm and S its dual t-conorm. The art of modeling is
delicate.  For  example,  based  on  three-valued  logic,  a  form  of  fuzzy  conditionals  was 
adopted  in  Chapter  7.  The  choice  of  the  copula  min  was  suggested  since  it  is  the  simplest 
one.  Other  choices  can  be  motivated  on  an  empirical  basis.  For  example,  if  "conjunction" 
is  to  be  modeled  mathematically  in  a  given  problem,  and  if  there  is  some  randomness 
involved in the gathered data, one can pick a copula for a t-norm, and view the modeling
problem  as  a  non-parametric  statistical  estimation  problem,  and  estimate  the  joint 
distribution  function  from  data  on  marginals. 
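
The triple (N, T, S) mentioned above can be made concrete with a small sketch. It assumes the standard negation N(x) = 1 - x and builds S from T by De Morgan duality, S(x, y) = N(T(N(x), N(y))). The three t-norms shown (min, product, and the Lukasiewicz t-norm) are chosen because they are also copulas; the function names are illustrative only.

```python
def negation(x):
    """Standard negation N(x) = 1 - x on [0, 1]."""
    return 1.0 - x


def t_min(x, y):
    """Minimum t-norm (also the upper Frechet bound among copulas)."""
    return min(x, y)


def t_product(x, y):
    """Product t-norm (the independence copula)."""
    return x * y


def t_lukasiewicz(x, y):
    """Lukasiewicz t-norm (the lower Frechet bound among copulas)."""
    return max(0.0, x + y - 1.0)


def dual_conorm(t_norm, neg=negation):
    """Build the t-conorm dual to t_norm with respect to neg."""
    def s(x, y):
        return neg(t_norm(neg(x), neg(y)))
    return s


s_max = dual_conorm(t_min)              # equals max(x, y)
s_prob = dual_conorm(t_product)         # equals x + y - x*y
s_bounded = dual_conorm(t_lukasiewicz)  # equals min(1, x + y)

if __name__ == "__main__":
    for name, s in (("max", s_max), ("prob. sum", s_prob), ("bounded sum", s_bounded)):
        print(name, s(0.25, 0.5))
```

Each choice of T fixes a different model of "conjunction", and through duality a different model of "disjunction"; selecting among them is the modeling decision discussed above.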

This  seems  appropriate  in  problems  such  as  modeling  activation  functions  in  neural 
networks.  Indeed,  basically  the  architecture  of  an  artificial  neural  network  can  be  placed 
within  the  theory  of  approximations  of  functions  of  several  variables.  (See,  for  example, 
Lorentz, 1966.) More specifically, it is related to Kolmogorov's theorem on representation
of functions of several variables by superpositions of functions of fewer variables (Lorentz,
1966, chapter 11, or Vitushkin, 1978). As such, statistical estimation procedures for
semi-parametric models can be used as learning rules. It is interesting to note that the
popular  back-propagation  training  algorithm  in  neural  networks  bears  some  close 
relationship  with  backfitting  procedures  in  projection  pursuit  regression  (see  Huber,  1985). 
It seems that a fundamental question in the field of neural networks is this: given a class
of functions, not necessarily completely specified, how should one design an efficient artificial
neural network to "process" any member of this class?
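
For reference, Kolmogorov's theorem cited above is commonly stated as follows. This is the standard formulation, given here as a reminder and not necessarily in the exact form used by Lorentz: every continuous real function f on the n-dimensional unit cube can be written as

```latex
f(x_1, \ldots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \psi_{q,p}(x_p) \right),
```

where the functions \Phi_q and \psi_{q,p} are continuous functions of one variable, and the inner functions \psi_{q,p} can be chosen independently of f. One common reading of this formula as a network takes the inner sums as a hidden layer of 2n + 1 units and the \Phi_q as output activations.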

In  a  recent  personal  communication,  Hestir  (1990)  showed  that  extreme  points  of  the 
space  of  copulas  (identified  as  probability  measures  on  the  unit  square  with  uniform 
marginals) can be characterized, so that the above estimation problem might be feasible.
The  space  of  copulas  is  a  compact,  convex  space  with  the  topology  of  weak  convergence 
of  measures. 
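
Convexity of the space of copulas can be illustrated numerically. The sketch below forms a convex mixture of three familiar copulas and checks, on a finite grid only, the two defining properties of a bivariate copula: the boundary conditions and the 2-increasing property. The grid check is a sanity test under stated assumptions, not a proof, and all names are hypothetical.

```python
def cop_min(u, v):
    """Upper Frechet bound M(u, v) = min(u, v)."""
    return min(u, v)


def cop_product(u, v):
    """Independence copula."""
    return u * v


def cop_lower(u, v):
    """Lower Frechet bound W(u, v) = max(0, u + v - 1)."""
    return max(0.0, u + v - 1.0)


def mixture(weights, copulas):
    """Convex combination of copulas; weights are assumed non-negative, summing to 1."""
    def c(u, v):
        return sum(w * cop(u, v) for w, cop in zip(weights, copulas))
    return c


def looks_like_copula(c, n=20, tol=1e-12):
    """Check boundary conditions and 2-increasingness on an (n+1) x (n+1) grid."""
    grid = [i / n for i in range(n + 1)]
    for u in grid:
        # C(u, 0) = C(0, u) = 0 and C(u, 1) = C(1, u) = u
        if abs(c(u, 0.0)) > tol or abs(c(0.0, u)) > tol:
            return False
        if abs(c(u, 1.0) - u) > tol or abs(c(1.0, u) - u) > tol:
            return False
    for i in range(n):
        for j in range(n):
            u1, u2, v1, v2 = grid[i], grid[i + 1], grid[j], grid[j + 1]
            # Mass assigned to each grid rectangle must be non-negative.
            if c(u2, v2) - c(u2, v1) - c(u1, v2) + c(u1, v1) < -tol:
                return False
    return True


mixed = mixture([0.2, 0.5, 0.3], [cop_min, cop_product, cop_lower])

if __name__ == "__main__":
    print(looks_like_copula(mixed))
```

By contrast, a function such as max(u, v) fails the boundary conditions and is rejected by the same check.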

C.  Non-monotonic  entailments  on  conditionals 

When  probability  is  used  as  a  quantification  of  uncertainty,  an  extension  of 
Probability Logic is needed for R|R. The resulting logic is called a Conditional
Probability Logic (Chapter 6). Conditional Probability Logic should be extended from
the sentential level to first order predicate calculus.

At a pragmatic level, non-monotonic entailment relations need to be specified as far
as  common  sense  reasoning  is  concerned.  Some  aspects  of  this  problem  were  discussed  in 
this  Chapter.  This  difficult  and  important  issue  in  mathematical  logic  should  be  further 
investigated. See also the recent work of the group Léa Sombé (1990).

D.  Higher  order  conditioning 

Once  conditionals  on  Boolean  rings  are  defined,  it  is  natural,  at  least  from  a 
mathematical  standpoint,  to  consider  conditionals  of  conditionals.  See,  however,  Pfanzagl 
(1971).  In  Chapters  7  and  8,  we  have  touched  upon  this  problem,  both  from  the  syntactic 
and  semantic  viewpoints. 

The  material  in  Section  8.1  is  incomplete.  Conditionals  are  defined  on  the  space 
R|R, yielding the space of iterated conditionals (R|R)|(R|R). The basic result is
Corollary 1 in that section, asserting that they consist of intervals in the Stone Algebra
R|R. This relied heavily on the fact that R|R is relatively pseudo-complemented. This
relative  pseudo-complementation  played  the  role  of  material  implication.  There,  we  also 
touched  on  a  way  to  assign  "probabilities"  to  these  iterated  conditionals.  An  algebra  of 
these  iterated  conditionals  has  not  been  developed.  No  binary  or  unary  operations  on 
(R|R)|(R|R) were defined and investigated. Much work remains to be done to clarify
the issue and to obtain a more satisfying theory of higher-order conditioning. Doing so
could be rewarding, and result in a tractable and important algebraic system, not only for its
modeling  of  higher  order  conditioning,  but  for  its  possible  connections  with  higher  order 
logics. 

E.  Fuzzy  conditionals  and  probability  qualification 

In  view  of  the  success  of  fuzzy  logic  in  AI,  we  have  devoted  the  entire  Chapter  7  to 
the  extension  of  ordinary  conditionals  to  the  fuzzy  case.  Our  semantic  approach  to  fuzzy 
conditionals  is  novel.  It  is  motivated  by  a  connection  between  membership  functions  and 
random  sets,  namely  randomization  of  level-sets  associated  with  membership  functions  of 
fuzzy  sets.  The  simplest  copula  min  was  chosen  to  define  membership  functions  of  fuzzy 
conditionals,  which  turn  out  to  be  interval-valued  fuzzy  sets.  As  in  fuzzy  logic,  other 
choices  of  copulas  are  possible.  It  might  be  of  interest  to  compare  fuzzy  conditionals  as 
perceived  here  with  various  concepts  of  conditional  possibility  distributions  in  the 
literature.  Also,  inference  with  fuzzy  conditionals,  for  example  fuzzy  implication 
operators,  should  be  investigated  further  for  applications.  See,  for  example,  Smets,  1990; 
Goodman, 1990.
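
The connection between membership functions and random sets mentioned above can be illustrated by a minimal sketch. It assumes the usual construction in which a single level alpha is drawn uniformly from [0, 1] and the random set is the level set {x : mu(x) >= alpha}. The uniform draw is replaced here by an evenly spaced grid of levels so that the result is deterministic. The membership values and names are hypothetical, and this is not the interval-valued construction of Chapter 7.

```python
def level_set(mu, alpha):
    """Alpha-level set of a membership function given as a dict."""
    return {x for x, m in mu.items() if m >= alpha}


def levels(n_levels):
    """Midpoints of n_levels equal subintervals of [0, 1]."""
    return [(k + 0.5) / n_levels for k in range(n_levels)]


def coverage(mu, n_levels=1000):
    """Fraction of levels whose level set contains each point.
    With a uniform level this approximates the one-point coverage mu(x)."""
    counts = {x: 0 for x in mu}
    for alpha in levels(n_levels):
        for x in level_set(mu, alpha):
            counts[x] += 1
    return {x: c / n_levels for x, c in counts.items()}


def joint_coverage(mu_a, x, mu_b, y, n_levels=1000):
    """Fraction of levels at which x is in the level set of mu_a and y in that
    of mu_b. Sharing one level between both sets corresponds to the copula min."""
    hits = sum(1 for alpha in levels(n_levels) if mu_a[x] >= alpha and mu_b[y] >= alpha)
    return hits / n_levels


# Hypothetical membership functions for two fuzzy sets.
mu_tall = {"150cm": 0.0, "170cm": 0.3, "180cm": 0.7, "190cm": 1.0}
mu_heavy = {"60kg": 0.1, "80kg": 0.6, "100kg": 1.0}

if __name__ == "__main__":
    print(coverage(mu_tall))
    print(joint_coverage(mu_tall, "170cm", mu_heavy, "80kg"))
```

Using independent levels for the two sets instead of a shared one would correspond to the product copula, which is one of the other choices alluded to above.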

F. Modal logic

Since  conditioning  was  shown  not  to  be  just  a  primitive  concept,  best  described  by 
axioms,  but  rather  can  be  analyzed  further  through  the  power  class  structure  of  the  space 
of conditionals, one can inquire whether other modal forms (see, for example, Rescher,
1968)  can  be  analogously  investigated,  including  deontic,  alethic  and  voluntative  modes, 
among  others.  This  could  use  reduction  of  such  forms  to  conditionals,  using  a  synthetic 
approach  as  in  the  analysis  of  the  latter,  not  a  top-down  external  approach  as  taken  by 
Palmer  (1986)  nor  the  formal  logical  stand  of  Searle  and  Vanderveken  (1985).  See  also 
Ruspini  (1989). 

G.  Non-additive  uncertainty  measure 

As  mentioned  several  times  in  this  monograph,  especially  in  Chapter  5,  conditionals, 
as  cosets  of  Boolean  rings,  were  derived  under  the  condition  of  compatibility  with 
conditional  probability.  Here,  Lewis'  Triviality  Result  plays  an  important  role.  It  is  clear 
that  this  result  depends  heavily  on  the  additivity  property  of  probability  measures.  If 
probability  measures,  viewed  as  set  functions,  are  generalized  to,  say,  Dempster-Shafer’s 
belief  functions,  which  are  non-additive  set  functions,  then  Lewis’  Triviality  Result  does 
not  hold.  Indeed,  as  pointed  out  in  Chapter  5,  material  implication  on  Boolean  rings  is 
compatible  with  conditional  belief  assignments.  Belief  functions  are  not  the  only 
non-additive  set  functions  considered  in  the  literature  of  artificial  intelligence.  Fuzzy 
measures  (for  example,  Sugeno,  1974),  or  decomposable  measures  with  respect  to 
t-conorms  (for  example,  Weber,  1984)  are  non-additive  set  functions.  Although,  in  many 
cases,  non-additive  measures  can  be  transformed  into  additive  ones,  in  the  spirit  of 
Lindley's admissibility (Lindley, 1982; Goodman, Nguyen, and Rogers, 1990), an analysis
of conditional events compatible with a given class of uncertainty measures might be of
interest, as a way to specify, at the syntax level, the "non-standard" logics underlying the
semantic  aspects  when  reasoning  with  various  types  of  uncertainty. 
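
The non-additivity of belief functions referred to above can be seen in a small example. The sketch below uses the standard Dempster-Shafer definitions of belief and plausibility from a mass assignment; the frame, the masses, and the function names are hypothetical, and nothing here reproduces the conditional belief assignments of Chapter 5.

```python
from fractions import Fraction


def belief(mass, event):
    """Bel(A): total mass of focal sets contained in A."""
    event = frozenset(event)
    return sum((m for focal, m in mass.items() if focal <= event), Fraction(0))


def plausibility(mass, event):
    """Pl(A): total mass of focal sets that intersect A."""
    event = frozenset(event)
    return sum((m for focal, m in mass.items() if focal & event), Fraction(0))


# Hypothetical frame of discernment and mass assignment (masses sum to 1).
frame = frozenset({"a", "b", "c"})
mass = {
    frozenset({"a"}): Fraction(1, 2),
    frame: Fraction(1, 2),  # mass left on the whole frame expresses ignorance
}

event = frozenset({"a"})
complement = frame - event

if __name__ == "__main__":
    # For an additive measure the two beliefs would sum to 1.
    print(belief(mass, event), belief(mass, complement))
    print(plausibility(mass, event), plausibility(mass, complement))
```

Here Bel(A) and Bel of the complement of A sum to less than 1, while the corresponding plausibilities sum to more than 1, so neither set function is additive.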